Almost any definition of big data is based on Doug Laney’s original proposition of the three Vs – velocity (the data is growing rapidly), variety (the data comes from many sources – both structured and unstructured) and volume (the data is big), says Gary Allemann, MD of Master Data Management.
However, there are challenges with the three Vs definition for big data. Furthermore, many organisations believe that big data is business intelligence (BI). Understanding these terms will ensure a business is able to harness its big data to obtain intelligent insights.
Gartner also defines big data as “high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimisation.” Once again, there are two principal challenges with the three Vs definition for big data:
* The three Vs are focused on the technical characteristics of big data. This is useful as a starting point, but not necessarily good for a maturing market. Big data must be understood as a business concept rather than as a technical concept if it is to add value in any organisation.
* The “and/or” nature of the definition is the cause of additional confusion. Any vendor with a solution that handles any one of the three Vs – for example, an existing database that can handle large volumes – is positioned as being a big data solution. Many vendors will focus on the one V that suits them and ignore the ‘that require new forms of processing’ part of this definition. In almost every case, big data must have at least two of these characteristics.
Adding to the confusion around the three Vs, further Vs have been added to continue the V theme whilst improving the underlying definition. Most commonly, these include veracity (the data must be accurate) and value (the data must be of business significance).
The addition of these Vs touches on some of the technical aspects of governing and managing big data, but it does not really address the challenges inherent in the initial definition. In order to move forward in a maturing market, one should ask the question – is it time to forget about the three Vs?
In order to let go of the three Vs, one approach is to look at what big data is not. Many organisations still believe that big data is BI – it is not. Big data solutions provide insight by inferring laws discovered by analysing large sets of data with low information density. BI, in contrast, is about analysing data with high information density to discover trends, measure compliance with known rules, etc.
Newer definitions focus on these differences. For example, Wikipedia describes big data as “an all-encompassing term for any collection of data sets so large and complex that it becomes difficult to process them using traditional data processing applications.” However, as with the three Vs definition, this remains largely technical and somewhat vague.
Newer approaches to big data analytics make big data solutions quicker and easier to deploy than traditional BI solutions. They leverage the unstructured nature of Hadoop to allow data to be quickly and easily assimilated and analysed.
Big data answers the “what if” questions.
Currently, many business decisions are based on ‘gut feel’ or intuition. This is not due to a lack of interest in data-driven decision making. In fact, most large companies have invested hundreds of millions in data warehousing and BI projects with a view to improving analytics and decision making.
However, the simple reality is that BI does not support forward-thinking decision making. For example, an executive who says “I have this gut feeling… Can you build me a model to test my hypothesis?” will wait months for IT to define data models, integrate data, build the analyses and answer the question. The cost incurred in answering this kind of question is not viable. In any case, long before the answer is given, the decision, right or wrong, has been made and the business has moved on. No wonder BI is often dismissed as “answering yesterday’s questions tomorrow”.
Big data’s strengths lie in answering these kinds of ‘what if’ questions. Unsurprisingly, many big data applications lie in the area of customer insight. Companies that wish to improve their customer experience, maximise pricing, optimise their channel mix or reduce fraud cannot use traditional BI to answer the ongoing ‘what if’ questions that arise for each proposed change to existing strategies and approaches.
Big data approaches that simplify the complexity of managing data allow businesses to answer today’s questions while they are still relevant.
Big data is about providing rapid time to insight for questions that cannot be cost-effectively answered using traditional BI. Organisations need to move away from the three Vs definition and towards a more business-orientated definition of big data – reducing data complexity to provide rapid time to insight.