Additional dimensions of Big Data - discussion revived once more


We have heard before of Big Data dimensions beyond the conventional three popularized by IBM and the like: Volume, Variety and Velocity. The discussion has now been revived after Mark Beyer, Gartner Research VP, addressed the gathering at Oracle OpenWorld 2012 and brought his 12-dimensional framework for Big Data to the fore.

(From what we have heard, during the same session George Lumpkin, Vice President, Product Management, also talked in grand Oracle fashion about Oracle’s In-Database Hadoop claims, while the purists whispered their genuine questions about real-time processing.

Beyond the hype, however, there was the informative talk from Mark, who was co-hosting the session with George.)

To better understand what Mark said, let’s dust off a 2011 Gartner report, which says that Big Data is only the beginning of extreme information management.

According to Gartner’s research team, comprising Mark Beyer, Anne Lapkin, Nicholas Gall, Donald Feinberg and Valentin T. Sribar, Big Data has more often than not been defined under various terms, which include Real-time data, Shared data (data shared across apps), Linked data (inter-related data from various sources) and High-fidelity data (containing context, detail, relationships and identities of important business information).

Extreme information management, according to them, means dealing with issues across a dozen different dimensions in three categories: quantification, access enablement and control, and qualification and assurance.

Quantification:
Volume of data
Velocity of data streams, access demands and record creation
Variety of data formats
Complexity of individual data types (standards, domain rules, storage formats for each asset type)

Access enablement and control:
Classification (sensitive/non-sensitive, private/public classifications and so on)
Contracts (agreements on who will share information and how)
Pervasiveness (how long data remains active)
Technology-enablement (specifications for tools and technology)

Qualification and assurance:
Fidelity (ability or inability to confidently adapt an asset for wider use)
Linked data (data in combination and the uses related to this context)
Validation of data (ensuring data is valid per use case)
Perishability (longevity, aging of data while retaining its state and character)
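
For readers who like to see a framework as a data structure, the twelve dimensions above can be written down as a simple checklist. The sketch below is only an illustration: the category and dimension names come from the list above, while the dictionary layout, the coverage() helper and the sample enterprise are our own assumptions and not part of Gartner's report.

```python
# Illustrative sketch only. Category and dimension names follow the list above;
# the structure and the coverage() helper are assumptions made for this example.
DIMENSIONS = {
    "Quantification": ["Volume", "Velocity", "Variety", "Complexity"],
    "Access enablement and control": [
        "Classification", "Contracts", "Pervasiveness", "Technology-enablement",
    ],
    "Qualification and assurance": [
        "Fidelity", "Linked data", "Validation", "Perishability",
    ],
}


def coverage(addressed):
    """Return, per category, which dimensions are ticked and which are still open."""
    addressed = set(addressed)
    known = {dim for dims in DIMENSIONS.values() for dim in dims}
    unknown = addressed - known
    if unknown:
        raise ValueError(f"Unknown dimensions: {sorted(unknown)}")
    return {
        category: {
            "ticked": [dim for dim in dims if dim in addressed],
            "open": [dim for dim in dims if dim not in addressed],
        }
        for category, dims in DIMENSIONS.items()
    }


if __name__ == "__main__":
    # Hypothetical enterprise that has so far tackled only the classic three Vs.
    report = coverage(["Volume", "Velocity", "Variety"])
    for category, status in report.items():
        still_open = ", ".join(status["open"]) or "none"
        print(f"{category}: {len(status['ticked'])} ticked, open: {still_open}")
```

Run against the classic three Vs, a checklist like this makes it plain how much of the framework sits outside volume, velocity and variety.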


In a nutshell, the Extreme Information Management framework offers a useful way to assess the key challenges of the Big Data explosion. CIOs must recognize the signs of these challenges while tapping data for competitive advantage. From a pure analyst perspective, we know there is no single tool that fits the bill for this framework. Today it takes a large environment to meet the needs and challenges posed by the Big Data world. We just hope that as the ecosystem matures, we can start putting more tick marks on each enterprise’s EA against this dimension list.