What is Big Data?
So what is Big Data? Simply put, Big Data is data that’s too large or complex to be handled effectively by the standard database technologies currently found in most organisations.
According to the conventional definition, for data to be regarded as “big”, it should possess a number of key attributes – volume, velocity and variety:
- Volume is just what it sounds like: lots of data. To put this in context, YouTube users upload 48 hours of new video every minute of every day.
- Velocity applies where the data is time-sensitive and needs to be processed and stored quickly. One example is the real-time profiling behind internet display adverts that are customised according to your usage pattern.
- Variety covers the various forms that data can take, from neatly structured tabular data to unstructured data containing items such as images, emails, spreadsheets, social media conversations and streaming media. Currently, there is no universally accepted “one-size-fits-all” approach to handling this data variety.
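To put the Volume figure in perspective, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration: the 48 hours per minute comes from the text above, while the function names are made up for this example.

```python
# Back-of-the-envelope scale check for the upload figure quoted above.
HOURS_UPLOADED_PER_MINUTE = 48   # figure quoted in the text
MINUTES_PER_DAY = 60 * 24


def hours_uploaded_per_day(hours_per_minute=HOURS_UPLOADED_PER_MINUTE):
    """Hours of new video arriving in a single day."""
    return hours_per_minute * MINUTES_PER_DAY


def years_of_viewing(hours):
    """Years needed to watch `hours` of video non-stop (365-day years)."""
    return hours / (24 * 365)


daily_hours = hours_uploaded_per_day()
print(daily_hours)                               # 69120
print(round(years_of_viewing(daily_hours), 1))   # 7.9
```

In other words, at that rate a single day of uploads would take nearly eight years to watch end to end.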
What about the other Vs?
Beyond the prerequisite Vs of Volume, Velocity and Variety, there are other, less often considered but probably more important Vs that can be associated with Big Data: Validity, Veracity, Value and Visibility. If the results of your Big Data processes are critical to your business, you may want to ensure these four additional Vs are rigorously assessed throughout those processes:
- Validity – the interpreted data having a sound basis in logic or fact – is a result of the logical inferences drawn from matching data. One of the most common errors is confusing correlation with causation.
- Volume – Validity = Worthlessness?
- Veracity – conformity to facts; accuracy – raises a practical question: do we need a spell checker to get data consistency?
- Big Data – Veracity = Incorrect inferences being drawn?
- Value – the importance, worth, or usefulness of the data to those consuming it – is probably the most relevant to organisations. Data in and of itself has no value.
- Big Data = Data + Value?
- Visibility – the state of being able to see or be seen – is implied. Data from disparate sources needs to be stitched together so that it is visible to the technology stack making up Big Data. Critical data that is otherwise available, but not visible to the processes of Big Data, may be one of the Achilles' heels of the Big Data paradigm. Conversely, unauthorised visibility is a risk.
- Big Data – Visibility = Black Hole?
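The Veracity question about data consistency can be illustrated with a minimal Python sketch. It assumes a hypothetical list of city names gathered from several sources; the sample values and the helper names `normalise` and `consistency_report` are invented for illustration and are not taken from any particular tool.

```python
def normalise(value):
    """Lower-case the value and collapse stray whitespace."""
    return " ".join(value.strip().lower().split())


def consistency_report(values):
    """Return (distinct raw values, distinct values after normalising)."""
    raw = set(values)
    cleaned = {normalise(v) for v in values}
    return len(raw), len(cleaned)


# Hypothetical sample: the same two cities captured inconsistently.
cities = ["London", "london ", "LONDON", "Lonodn", "Paris", " paris"]

raw_count, cleaned_count = consistency_report(cities)
print(raw_count, cleaned_count)   # 6 3
```

Simple normalisation removes the differences in case and spacing, but the misspelt "Lonodn" still counts as a separate city. That leftover is the kind of issue a spell checker or reference list would have to catch before inferences are drawn from the data.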
In summary, Big Data is just like the Cloud and other business technology ecosystems: it has its place, can be a game changer and can deliver real value. The trick is to ensure your Big Data delivers what’s expected, does not become a Black Hole for cost and effort and, most importantly, does not elevate your risk profile through incorrect decision making because some of the key Vs were not clearly understood.