In a recent blog post, Stephen Wolfram, creator of Mathematica, author of “A New Kind of Science”, creator of Wolfram|Alpha, and founder and CEO of Wolfram Research, described how, over a long period of time, he amassed “probably one of the world’s largest collections of personal data”. In the post, he walks through various analyses he recently performed on this data.
In my opinion the results of these analyses hold an interesting, cautionary tale for people working in the new world of Big Data, where there is a risk of analysing data just because it is there.
My assessment of the various results of Stephen’s analysis would be:
- 95% obvious – (e.g. “there’s been a progressive increase in my email volume over the years”, “peaks [in email] are often associated with intense early-stage projects, where I am directly interacting with lots of people”)
- 5% interesting but not useful – (e.g. “7% of all keystrokes are backspaces”, “a large volume of Stephen’s work has been done between midnight and 6am”)
The working-at-night observation might be more interesting if data were analysed across many people to see if there was a correlation between not requiring much sleep and success. Alas, even if this were true it would not be that useful (unless you believe that you can train yourself to require less sleep).
There is one area which might be useful: the analysis of the amount of time Stephen worked on his book “A New Kind of Science”. The data here may help him, in the future, estimate more accurately how long another book would take to write but, ironically, it is unlikely to help him write another book any faster.
One would have hoped that with all this data, collected over so many years and studied by someone with such an analytical mind, there would be some usable insights, ones which could be acted on to make a change for the better … but alas, it would seem not.
And that is the cautionary tale for the BI world.
Be wary of “analysis for its own sake” or those suggesting that expensive Big Data projects “don’t need business requirements” because they are “finding insight that wasn’t known about before”. All too often, these types of projects produce interesting results but no useful information (and by useful I mean actionable, i.e. information which can be used to drive action and so, one hopes, to change something).
Don’t get me wrong, I am as convinced as everyone else that the analysis of “Big Data” will provide many valuable insights over the coming months and years. I am just concerned that if we plunge headlong into it, without thinking, the amount of time, money and effort wasted will far outweigh the benefits.
As I have said before, with any business intelligence / analytics project, it should always come down to business requirements. Always have an idea of what you are looking for and why. A project to “analyse our web site data to understand if there is a better way of laying out our site to keep people on it for longer” is many, many times more likely to produce a tangible result than a project to “analyse clickstream data to see if there is anything interesting we didn’t know”.
Perhaps I am being too cynical. If I am, I would love to hear any stories about a data analysis project which produced something truly unexpected that was used to make a significant business impact (apocryphal stories about beer and diapers/nappies need not apply).