Wednesday, January 25, 2012

Stories are the Last Mile in Big Data

Conventional wisdom says that in order to understand anything in business, you need to track it. When it comes to sales, logistics, customer service, employee performance, call centers and all of the other issues that drive a business, knowing how you are doing is the first step in understanding how to do it better.

Fortunately, the rise of hyper low-cost computing and storage, combined with the drive towards bringing more of our data online, has given us a world in which we can now monitor and measure nearly every aspect of running a business. On top of this, we now see a huge surge of information coming from the social sphere that can be harvested and harnessed for business intelligence purposes.

World of Big Data
But data is meaningless unless it can be converted to insight. Holding onto the record of every call into your customer service center and all of your product returns is not helpful unless you can establish a correlation between the different dimensions that each captures. Shipping and delivery records make no sense if they are not linked to the features that contribute to on-time performance. Knowing about error rates in production is of little help if you don’t also have records of raw materials and parts from different vendors and work shift information.

Data alone isn’t the answer. In fact, from a business perspective, the data is still part of the problem. The answer is insight, which is derived from the data.

As I write this, numerous organizations, both commercial and research, are attacking this problem. There are substantial efforts aimed at using data mining, correlation analysis, machine learning, recommendation engines and anything else that people can think of to solve the problem of understanding the relationships between the data elements that will help to inform and transform business practices. However, these approaches are often driven by what is doable and interesting from an engineering perspective, rather than useful and readable from a business user’s point of view.

Even when Big Data is drawn down to a usable form which, for lack of a better term, we can call “small data”, the question still remains: “How do I communicate with my users?”

Tables are fine, but they tend to be hard to deal with regardless of the size of the data sample. In the table below, which is derived from point-of-sale information consisting of millions of rows of data, nothing exciting immediately leaps out.

Even when you are motivated to do so, how do you approach reading a table like this without risking a rapid and dramatic loss of interest?

Of course, visualization is often touted as the solution, but is a chart based on these numbers that much better?
While relationships can be drawn out from both the numbers in the table and the bars on this graph, there is still the issue of the level of effort and attention that it takes to pull out even the simplest of observations.

So, how about this instead:

Store 9, your sales of Item 6 are far below the other stores in your region. If you are able to up your sales of this product by only 5 units a day, you will be able to increase your profits next month by $1,123. The sales of this product for other stores in your region seem to indicate that this is completely achievable.

Of course, I am cheating here because I am also including data associated with the profit margins of these products that are presented in other tables/charts on other pages not included here. But that is the reality of reports. Also, the message is aimed at a particular store, but isn’t that who should be reading this message anyway? For any pool of data, there are always going to be multiple stakeholders, and they should each be receiving their own targeted messaging.
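To make the mechanics concrete, here is a minimal sketch in Python of how a message like the one to Store 9 might be assembled from small data. Everything in it is an assumption for illustration: the store figures, the per-unit margin of $7.50, the 30-day month, the "less than half the peer average" threshold, and the names StoreSales and underperformer_stories are all hypothetical. It is not a description of how any real story-generation system works.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class StoreSales:
    store: str
    units_per_day: float  # average daily units sold of one item


def underperformer_stories(item, sales, margin_per_unit,
                           extra_units=5, days=30, threshold=0.5):
    """Return {store: story} for stores selling far less than their peers.

    Assumes at least two stores, so every store has a peer to compare with.
    """
    stories = {}
    for s in sales:
        # Compare each store against the average of the *other* stores.
        peer_avg = mean(p.units_per_day for p in sales if p.store != s.store)
        if s.units_per_day >= threshold * peer_avg:
            continue  # not far enough below its peers to warrant a message

        # Projected extra profit: extra units per day x days x margin per unit.
        gain = extra_units * days * margin_per_unit
        text = (
            f"{s.store}, your sales of {item} are far below the other stores "
            f"in your region. If you are able to up your sales of this product "
            f"by only {extra_units} units a day, you will be able to increase "
            f"your profits next month by ${gain:,.0f}."
        )
        # Only call the goal achievable if it stays within what peers already sell.
        if s.units_per_day + extra_units <= peer_avg:
            text += (
                " The sales of this product for other stores in your region "
                "seem to indicate that this is completely achievable."
            )
        stories[s.store] = text
    return stories


# Hypothetical regional data, invented purely for this example.
region = [
    StoreSales("Store 7", 22),
    StoreSales("Store 8", 25),
    StoreSales("Store 9", 6),
    StoreSales("Store 10", 24),
]

for store, story in underperformer_stories("Item 6", region, 7.50).items():
    print(story)
```

With these made-up numbers, only Store 9 receives a message, and the projected gain is 5 × 30 × $7.50 = $1,125. The point of the sketch is that the margin lives in a different table from the sales figures, and the story is what brings the two together for the one reader who can act on it.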

The point is simple. Data is not the target. Data is not the answer. Data is not the insight.

Rather, data is the enabler for the real target: Insight that is communicated to the right person at the right time, in the right way. And although the above story is short, it is clear, clean and to the point. And it is aimed at a business problem that can be addressed through the appropriate analysis of the data and, just as important, the appropriate communication of the message.

I love Big Data. The move towards Big Data has the potential to change the way we do everything. But the last mile has to be the Story; the Story that communicates what is happening in the world, and what needs to be done to fix the problems and exploit the opportunities that analysis exposes.

Of course, this assumes two things:

1. You have some idea to begin with of what stories you want to tell. If you have no idea of what you want to achieve from your data, the likelihood of getting to something of value is low at best. There are counterexamples to this rule, but they tend to be few and far between. If you know what sorts of potential stories there are, and what you want to get from the data, then the analysis required to get the insights hidden in it can be focused and effective.

2. You have a technology that allows you to transform that insight at the data level into crisp, clear and focused reports. As Narrative Science has demonstrated, this capability is not only possible, but is actually practical at a level that enables the generation of stories from data at tremendous scale. Going from data, to insight, to 10,000 individuated reports on a daily basis is not an idea, but a reality.

So, while the Story is the last mile in Big Data, it is also the first step. Knowing the stories you want to tell gives you the focus to do the analysis that will allow you to tell them. With the Story, Big Data becomes the transformational force that we all want it to be.