Top Big Data Influencers of 2015

2015 was an exciting year for big data and the Hadoop ecosystem. We saw Hadoop become an essential part of the data management strategy of almost all major enterprise organizations. There is now cut-throat competition among IT vendors to help realize the vision of the data hub, data lake and data warehouse with Hadoop and Spark.

As part of its annual assessment of big data and the Hadoop ecosystem, HadoopSphere publishes a list of top big data influencers each year. The list is derived using a scientific methodology that assesses various parameters in each category of influencers. The HadoopSphere Top Big Data Influencers list reflects the people, products, organizations and portals that exercised the most influence on big data and its ecosystem in a particular year. The influencers are listed in the following categories:

  • Analysts
  • Social Media
  • Online Media
  • Products
  • Techies
  • Coach
  • Thought Leaders


Analysts:

Doug Henschen: It would have been hard to miss Doug Henschen writing for InformationWeek. With his accomplished media experience and proven expertise in industry analysis, Doug has now joined Wang at Constellation Research, talking about big data. His current focus areas include good data, streaming, cloud solutions and self-service data.

Merv Adrian: The saner voice on big data at the influential research firm Gartner, Merv Adrian makes sure we make sense of the dichotomy between the data warehouse and the data lake. He understands the breadth and depth of the Hadoop ecosystem and provides the vision to cut through the hype.

Tony Baer: When you talk to Tony Baer, don’t expect rebel thoughts, just plain incisive wisdom unravelling with each statement. With prose that reads like poetry, his analysis leaves an indelible mark on your understanding of the big data ecosystem. He has remained at the top of the Hadoop analyst game for many years in a row now.

Social Media:

Bernard Marr: Bernard Marr is an author, speaker and consultant with wide interests in strategic performance, analytics, KPIs and big data. He is the founder of the Advanced Performance Institute and provides consulting to various organizations. Bernard has a massive following on Twitter, and his LinkedIn posts generate huge interaction and interest.

Cloudera: Cloudera is the market leader in Hadoop distros and at the same time continues to influence social media followers. It may not have the most followers compared to other companies, but most of its messages get the right amplification and impact. Kudos to the Cloudera social marketing team.

Gregory Piatetsky-Shapiro: Gregory is the President of KDnuggets and a founder of KDD (the Knowledge Discovery and Data Mining conferences). His social media messages attract the right amount of traffic and eyeballs, making him one of the most relevant social media influencers.

Online Media:

O’Reilly Media: O’Reilly Media is now a diversified group with interests ranging from books to blogs and from webinars to conferences. With Strata Hadoop World now one of its most visible products after books, O’Reilly Media is definitely shaping big data opinion in the industry.

TDWI: With research papers, blogs, webinars and education events, TDWI continues to attract impressions and leads for marketers.

The Cube: The Cube is a pioneering online television series filmed at various industry events. It brings the best minds onto the show to speak about the future of big data. Chic, image-setting television, it boasts CxO speakers like no other forum can.


Products:

Actian Vortex: Actian Vortex is a really sharp SQL-in-Hadoop product that brings the best of database SQL to the Hadoop and YARN world. With innovative engineering under the hood to support ACID transactions and higher performance, it has motivated quite a few solutions in its arena.

Apache Flink: Apache Flink started off as a research project and soon created a unique identity for its streaming capabilities. It has influenced quite a few features in competing streaming products, such as off-heap memory management, datasets and the like.

Kyvos Insights: Kyvos Insights is an OLAP product that builds cubes at big data scale while assuring low-latency SLAs on Hadoop. With pre-canned cubes, interactive queries on terabytes of data within 2 seconds are a real possibility and an eye-catcher. As the trendsetter for cubes on Hadoop, it has inspired a few imitations in its trail, but none on par so far.


Techies:

Reynold Xin: As one of the co-founders of Databricks and Apache Spark, Reynold Xin continues to influence major innovations in Spark, including Tungsten memory management, DataFrames and many more. Sharp and futuristic, he is a real tech force.

Roman Shaposhnik: With the Open Data Platform (ODP), Roman Shaposhnik has found a new home for corporate Hadoop and continues to lead the initiative magnificently. Pushing other Apache projects such as Bigtop alongside and acting as a mentor to others like Ignite, Roman emerged as a true tech leader in the last year.

Todd Lipcon: When Todd brought HA to Hadoop, he brought Hadoop to the enterprise infrastructure. When Todd Lipcon brought Kudu to Hadoop, he brought Hadoop to the enterprise database. Believe it or not, Todd has unassumingly and unwittingly become the enterprise champion for Hadoop.


Coach:

Paco Nathan: If you are looking for a Spark session at an industry event or in online resources, chances are you may have attended one of Paco Nathan’s sessions. An evil mad scientist, as he likes to proclaim himself, he is about much more than Spark, with data science, maths, venture capital and learning coaching among his many interests.

Shane Curcuru: Championing community over code and Apache open source over corporate proprietary software, Shane Curcuru has been evangelizing Apache for years now. As one of the ASF directors, he ensures the Apache brand name is taken care of in the right measure and that community-driven projects get their fair share of sun.

Thought Leaders:

Ion Stoica: As one of the main founders of Apache Spark, Ion Stoica has already rallied the entire data world around one product. However, his vision with Databricks does not seem to be confined to just a batch execution engine. It seems Databricks is out to get a bigger share of the data center with its cloud offerings, and the innovations continue rolling in at an unprecedented velocity.

Mike Olson: As the Chief Strategy Officer and Chairperson of Cloudera, Mike Olson has made sure Cloudera remains at the top of the Hadoop game. Resisting the bait of a market IPO or acquisition and staying on the innovation path, he has kept Cloudera a steady ship. Open to disruptions like Spark and embracing partners, he has been a true leader who thinks and acts with vision and authority.
