I love data visualization, but it's not the only way of giving your data a voice.
From pen and paper through Excel, Google Viz, R, and D3, visualization brings out the flavor of your data. I believe it should play a role at every stage of data analysis: use it to picture the process of data collection, to understand how data moves through your system, to clean data, to find patterns, and to show results. But visualization is not the only way of giving your data a voice.
Statements, models, and anecdotes are three other means of communicating the message of your data -- both to others and to yourself, for your own understanding. Let's study how each of these could be applied by using a very simple example.
The case study
Research you've done has found that, by taking into account the weather forecast, a supermarket would be better able to predict (and thus satisfy) customer demand. You've looked at several key product lines, you've run some heavy machine learning tasks, and your hard drive and head are filled with numbers. It's time to take the message to the C-level and get approval for a new ordering process to go into production. What do you present?
Statement
By taking weather forecasts into account we will capture $20 million in sales we would have missed.
[Image caption: Anecdotes can be as effective as visualizations in communicating big data insights]
A statement like this is simple and quantitative, and that grabs the attention of the C-level. But don't think simple statements are useful only for presenting your findings; hold on to pithy thoughts as you explore the data, and use them to motivate your team and maintain your footing as you move through massive analytic tasks.
While this statement is powerful -- and probably the thing you should lead with in this scenario -- it invites a question: how? I would recommend a graph at this point -- perhaps a time series of the volume of demand for weather-affected products, with the temperature shown on a secondary axis -- but then back that graph up with something else.
Model
Despite several hundred years of success in general science, models are somewhat unfashionable in data science. Using machine learning bypasses the need to construct models of cause and effect, and this can make it much quicker to get to results. But the brain deals very well with models of cause and effect and much less well with probability trees. Even if you arrived at your results through machine learning, that doesn't mean the results can't be described in model terms.
Here's a simple example: "For every one degree above 25 degrees, our customers buy $100,000 extra ice cream per day, but often we run out." A model is quantitative, usually implies some causal connection, and it's battle-tested by hundreds of years of science.
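To make the example model concrete, here is a minimal Python sketch of that one sentence. It rests on several assumptions that are not in the research described above: the temperature unit is taken to be Celsius, the relationship is treated as linear above the threshold and zero below it, and the function names and the stock figure in the usage lines are hypothetical.

```python
def extra_ice_cream_demand(temp_c, threshold_c=25.0, dollars_per_degree=100_000.0):
    """Extra daily ice cream demand in dollars implied by the example model.

    Assumes a linear effect above the threshold and no effect below it.
    """
    degrees_above = max(0.0, temp_c - threshold_c)
    return degrees_above * dollars_per_degree


def missed_sales(demand_dollars, stock_dollars):
    """Demand that cannot be met because stock runs out (the 'but often we run out' part)."""
    return max(0.0, demand_dollars - stock_dollars)


# Usage with made-up numbers: a 28-degree day, with stock covering $250,000 of the extra demand.
demand = extra_ice_cream_demand(28.0)
shortfall = missed_sales(demand, 250_000.0)
print(demand, shortfall)
```

Even a toy version like this makes the model's claims explicit and easy to challenge: the threshold, the slope, and what happens when stock runs out.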
Anecdote
The plural of anecdote is not data. Nonetheless, datasets are often replete with anecdotes that can aid your understanding. Try this: "During the heat wave last May, we ran out of ice cream entirely in 20 stores. Our staff told us they were frequently being asked to look in the backroom." Maybe your CEO even tried to buy ice cream that week. This is a story that's easy to relate to, and it also shows a direct link between your data and satisfaction of your customers.
Anecdotes might seem anathema to data, but they're not. In fact, pretty much every visualization is just an anecdote writ large and colored in. In the context of big data, it is extremely rare that an entire dataset can be encapsulated in a single visualization. Drawing on just a portion of the data produces results that are anecdotal in the sense that they can't encompass everything.
Use of anecdote is common in journalism -- including data journalism -- and drawing on this technique can not only help the C-level understand what you're doing, it can aid your own thinking. Use anecdotes, but use them with caution, being mindful that each one misses huge swathes of data.
User Rank: Exabyte Executive 1/30/2013 | 8:45:23 AM
Re: on anecdotes The challenge with using anecdotes in the workplace is that you can only use them so many times and have so many before they become counterproductive. The user/business will be constantly searching for new anecdotes to make the same point.
I can think of one corrupting development from this. Remember Bernie Madoff? He probably started with some great results for clients and the "anecdotes" of those successes pulled in new customers, who in turn needed more anecdotal evidence. In the end, Madoff started making up "anecdotes" and we all know how that ended. Don't we?
Re: on anecdotes I like it @Anna - 'Refined Data' - it instantly sounds more trustworthy too. To be able to pull this from someone's insight is certainly key... but 'anecdotal evidence' can sound too "one off".
User Rank: Exabyte Executive 1/28/2013 | 6:12:04 AM
Re: on anecdotes The anecdote is best described as "refined data." It's always been the best way to consume information and get the most out of another person's or enterprise's experience. If only we could treat some data points this way. We could then eliminate some of the time spent analyzing what we really don't need.
User Rank: Exabyte Executive 1/26/2013 | 7:59:36 AM
Re: on anecdotes Executives care about insights, but those insights must meet some type of criteria to get their attention, not to mention buy-in for your big data projects. Anecdotes serve this purpose well to get a foot in the door.
User Rank: Exabyte Executive 1/25/2013 | 9:49:45 AM
Re: on anecdotes Good point @Saul, but, as you said in your other post, the anecdote has got to lead to something of value for the CIO, because the CIO has to sell it to the CEO. The anecdote can't just lead to more hype.
Re: on anecdotes The worrying thing with journalism is that it's those hyped-up anecdotes that shift units. All the 'data' might be contained in the article, or research, but the headline ends up being one anecdote more exciting than the rest. Not a lie, but unrepresentative of the overall truth.
While that model works for selling newspapers, it wouldn't translate to business use of data. If you get the C-level to buy in on an anecdote which represents a sliver of the overall facts... when the end results are shown, you will be held to your word.