Curation, that is, filtering by topic or keyword, has its role. But the reality is that even aggressive filtering by terms, sentiment, authority, and location only cuts the hundreds of millions down to tens of thousands, a number that is still unmanageable from a human perspective. And as to readability, short lists are still lists.
The question comes down to this: what is the goal? That is, what insight do we want to draw from the stream, and how do we want to communicate it?
Of course, at Narrative Science, our view is that we want to transform the massive stream of data that flows through the firehose into stories that are human readable and express the insights hidden within the stream. To do this, we have to track, filter, tag, and organize the unstructured stream into a semi-structured data asset that can then be used to support automatic narrative generation.
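To make the "filter, tag, and organize" step concrete, here is a minimal sketch of turning raw tweet text into a semi-structured record. It is an illustration under stated assumptions, not a description of our engine: the keyword tables `CANDIDATES` and `TOPICS`, the `TaggedTweet` record, and the `tag_tweet` function are all hypothetical names invented for this example, and simple keyword matching stands in for whatever tagging a production system would use.

```python
import re
from dataclasses import dataclass, field

# Hypothetical keyword tables, for illustration only.
CANDIDATES = {
    "Newt Gingrich": ["gingrich", "newt"],
    "Ron Paul": ["ron paul"],
    "Rick Santorum": ["santorum"],
}
TOPICS = {
    "taxes": ["tax", "taxes"],
    "character": ["character"],
    "religion": ["religion", "religious", "faith"],
}


@dataclass
class TaggedTweet:
    """A semi-structured record built from one unstructured tweet."""
    user: str
    text: str
    candidates: list = field(default_factory=list)
    topics: list = field(default_factory=list)


def _mentions(text, keywords):
    # Whole-word (or whole-phrase) match so "newt" does not hit "newton".
    return any(re.search(r"\b" + re.escape(k) + r"\b", text) for k in keywords)


def tag_tweet(user, text):
    """Return a TaggedTweet, or None if no tracked candidate is mentioned."""
    lowered = text.lower()
    candidates = [name for name, kws in CANDIDATES.items() if _mentions(lowered, kws)]
    if not candidates:
        return None  # filtered out of the focused stream
    topics = [topic for topic, kws in TOPICS.items() if _mentions(lowered, kws)]
    return TaggedTweet(user, text, candidates, topics)
```

Calling `tag_tweet` on each incoming tweet and discarding the `None` results yields a list of records that can be counted, grouped, and compared, which is what the narrative step needs.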
Our first foray into this work has been to look at the Twitter traffic related to the Republican primary candidates. Using a focused data stream, our technology captures and tags the ongoing conversations and then transforms the resulting data into stories. Our first story type focuses on how the candidates are trending and which topics are driving those trends. By linking the stream to events in the world, in this case the primaries themselves, our engine can produce a daily report that captures a snapshot of where the candidates are and what issues brought them there.
While it is still in beta, we thought it might be nice to provide a peek at what is coming with regard to how we are using an ongoing stream of tweets to generate stories that express the state of the world in a form that is ever so slightly more human.
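As a rough illustration of the trend-and-driver idea, the sketch below compares two days of tagged tweets, finds the candidate with the largest increase in volume, and lists the topics mentioned most often alongside that candidate. The function name `trend_report` and the input shape (a list of `(candidate, topics)` pairs per day) are assumptions made for this example. Raw count differences are only one possible way to measure a trend, and this is not a statement of how our engine computes it.

```python
from collections import Counter


def trend_report(yesterday, today, n_drivers=2):
    """Compare two days of (candidate, topics) pairs.

    Returns the top riser, the per-candidate change in tweet counts,
    and the most frequent topics attached to the top riser today.
    """
    prev = Counter(candidate for candidate, _ in yesterday)
    curr = Counter(candidate for candidate, _ in today)

    # Change in volume for every candidate seen on either day.
    changes = {c: curr[c] - prev[c] for c in set(prev) | set(curr)}

    # Sort names first so ties are broken alphabetically, not arbitrarily.
    top_riser = max(sorted(changes), key=changes.get)

    # Topics that co-occur with the top riser in today's tweets.
    drivers = Counter(
        topic
        for candidate, topics in today
        if candidate == top_riser
        for topic in topics
    )
    return top_riser, changes, [t for t, _ in drivers.most_common(n_drivers)]


yesterday = [
    ("Newt Gingrich", ["taxes"]),
    ("Ron Paul", ["religion"]),
    ("Ron Paul", ["taxes"]),
]
today = [
    ("Newt Gingrich", ["taxes"]),
    ("Newt Gingrich", ["taxes", "character"]),
    ("Newt Gingrich", ["taxes"]),
    ("Ron Paul", ["religion"]),
]
riser, changes, drivers = trend_report(yesterday, today)
```

With this made-up sample data, `riser` is "Newt Gingrich", `changes` shows Gingrich up by 2 and Ron Paul down by 1, and `drivers` is `["taxes", "character"]`. Those three values are the raw material for a headline and an opening paragraph.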
NEWT GINGRICH GAINS ATTENTION WITH HOT-BUTTON TOPICS TAXES, CHARACTER ISSUES
Newt Gingrich received the largest increase in Tweets about him today. Twitter activity associated with the candidate has shot up since yesterday, with most users tweeting about taxes and character issues. Newt Gingrich has been consistently popular on Twitter, as he has been the top riser on the site for the last four days. Conversely, the number of tweets about Ron Paul has dropped in the past 24 hours. Another traffic loser was Rick Santorum, who has also seen tweets about him fall off a bit.
While the overall tone of the Gingrich tweets is positive, public opinion regarding the candidate and character issues is trending negatively. In particular, @MommaVickers says, "Someone needs to put The Blood Arm's 'Suspicious Character' to a photo montage of Newt Gingrich. #pimp".
On the other hand, tweeters with a long reach are on the upside with regard to Newt Gingrich's take on taxes. Tweeting about this issue, @elvisroy000 says, "Newt Gingrich Cut Taxes Balanced Budget, 1n 80s and 90s, Newt experienced Conservative with values".
Maine recently held its primary, but it isn't talking about Gingrich. Instead the focus is on Ron Paul and religious issues.
It is only the beginning, but we see this as the first step in wrangling the firehose and turning the stream into stories.