
Big Data forecast digest for 2013 – predictions from Who's Who of industry

“Predicting the future is … a custom in our industry around this time of the year, and we won’t dodge it.” This comment from Yves de Montcheuil, VP of Marketing at Talend, aptly sums up the mood of the first few weeks of the year.

There has been a rush of predictions coming in with stunning quotes and disparate forecasts. Some of them have come in as research reports, others as blogs and a few of them on Twitter.

Sample this: Tony Baer tweeted, “My prediction: 2013 will be the year we deal with the unsexy side of #BigData: data governance, infrastructure mgmt, appdev, etc.”

Such influential statements make you sit up, take notice and give your 2013 plans another round of thought.

hadoopsphere.com takes you through the predictions, forecasts and trends that have been outlined by 9 notable influencers. Scroll through the slide show to read more (incompatible browsers may see the post as one lengthy single slide document). 

2013 Big Data forecast slide digest


“Big Data” becomes “data”

·         “Big Data” becomes “data”

·         Emergence of vertically aligned Apache Hadoop “solutions”

·         “Right-time” query of Apache Hadoop becomes reality

·         More Hadoop startups

·         Apache Hadoop v2 (YARN and MR2) becomes the standard for Hadoop data management

·         The big data ecosystem expands

·         Apache Ambari sets the standard for Hadoop operations





Big Data could be TIME 2013 person of the year

·         Firms will realize that “big data” means all of their data.
o        “By the way, some predict the end of the data warehouse — but that’s nonsense. If anything, all forms of data technology will evolve and be necessary to handle the frontier of big data. In 2013, all data is big data.”
·         The algorithm wars will begin. 
o        “Algorithms are the engine to explore virgin data… In 2013, CEOs will give their firms an imperative to beef up their data science capabilities.”
·         Real-time architectures will swing to prominence.
o        “Firms will seek out streaming, event processing, and in-memory data technologies to provide real-time analytics and run predictive models. Mobile is a key driver.”
·         Naysayers will fall silent.
o        “Time magazine will name big data its 2013 person of the year.”




Chasing shiny objects while waiting for technology bake-offs


·         “Create a new breed of applications - Intelligent Applications… It's about using data to make our customer touch points more engaging, more interactive, more intelligent.”
·         Market segments by technology that will see the highest growth:

o        Data Analytics as a Service (also referred to as Big Data as a Service, or BDaaS)
o        Business Intelligence as a Service
o        Logging as a Service.

·         Challenges that end-user organizations will struggle with the most in 2013:
o        “who owns the platform powering their much-needed data-driven applications
o        Ultimately, end-users will be forced to chase "shiny objects" because IT groups will persuade them to wait for the "technology bake-offs" around the Big Data platform soon to be launched (24 months from now)
o        In the end, many organizations will fail at creating value from Big Data due to a lack of focus on business problems, time-to-market, and in some cases the wrong technology choice”
·         You will need the following technology components:
o        Real-time stream processing
o        Ad-hoc analytics (see NoSQL and NewSQL data stores)
o        Batch Analytics
“Not one, but all three!”
·         “Cloud will become a large part of big data deployment - established by a new cloud ecosystem… elastic big data clouds behind the firewall and within trusted third party data center providers.”
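The "not one, but all three" point above can be sketched as a toy pipeline that keeps an incrementally updated real-time view alongside a batch recompute over the raw events. All names here are illustrative; a real deployment would use a stream processor, a NoSQL/NewSQL store and a batch engine rather than in-process Python:

```python
from collections import defaultdict

class HybridPipeline:
    """Toy stand-in for a stream + ad-hoc + batch stack (names are hypothetical)."""

    def __init__(self):
        self.store = []                     # raw events (batch layer input)
        self.realtime = defaultdict(int)    # incremental counts (stream layer)

    def ingest(self, event):
        """Real-time stream processing: update counters as each event arrives."""
        self.store.append(event)
        self.realtime[event["key"]] += 1

    def query(self, key):
        """Ad-hoc analytics: point lookup against the continuously updated view."""
        return self.realtime[key]

    def batch_recompute(self):
        """Batch analytics: full recount over all stored events."""
        counts = defaultdict(int)
        for event in self.store:
            counts[event["key"]] += 1
        return dict(counts)

p = HybridPipeline()
for k in ["clicks", "clicks", "views"]:
    p.ingest({"key": k})

print(p.query("clicks"))        # -> 2
print(p.batch_recompute())      # -> {'clicks': 2, 'views': 1}
```

The point of the sketch is that the live view and the batch recount answer the same question at different latencies and costs, which is why the prediction insists on all three components rather than any one of them.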



From monolithic to hybrid platforms

·         Prediction #5: We will all know at least one colleague who is bragging about a petabyte stockpile of new data
The challenges are the costs of storage, the administrative overhead of managing this much data, and bringing enough computation to the data in a way that we can reasonably filter, organize and analyze the data.
·         Prediction #4: ‘Delete’ will become a forbidden word
If we don’t keep that information, we are precluded from doing whatever insightful analytics could have been the “killer use case”
·         Prediction #3: There will be a mad dash for software-defined storage
The chase will come from multiple dimensions:
o        Software-defined SAN
o        Software-defined NAS, in particular scale-out NAS
o        Software-defined Object stores
·         Prediction #2: The default infrastructure for Big Data will change
o        We should expect a tipping point in network infrastructure: 10GbE networks and high-bandwidth switch topologies… In 2013, data and compute can be anywhere in a switch domain with little or no performance difference.
o        Additionally, the decreasing cost of flash and the increasing availability of software to take advantage of multiple tiers of storage will mean that flash will be an integral part of every storage architecture.
·         Prediction #1: The focus on big data use cases will shift heavily towards real-time
“the need shifts from a monolithic map-reduce powered platform to a hybrid of real-time, batch and machine learning…”



Mundane and mainstream Hadoop


·         2013 will see the rise of integration platforms designed to support and be deployed in complex hybrid environments, spanning on-premises and cloud-based systems.
·         After the success of experimental deployments, fueled by an accelerated maturity curve, Hadoop will gain mainstream acceptance in 2013.
·         Away from the hype of data science, big data platforms will be used in 2013 to offload “mundane” tasks that can benefit from extreme and inexpensive scalability.
·         In 2013, big data will drive part of the requirements for MDM programs, recognizing that new types of data are becoming constituents of enterprise information. 



Big Data on every company’s shortlist

RainStor

·         Prediction 1: Enterprise Big Data Initiatives Move out of the Sandbox and Define a Clear Set of Business and Technology Requirements
·         Prediction 2: Companies will Look to New Technology Combinations, other than Hadoop, when Managing Big Data
·         Prediction 3: Budget Limitations will pose one of the Biggest Hurdles to Solving Big Data Challenges
“According to a recent analyst report, Big Data spending is projected to drive $34 billion in 2013. Organizations in particular sectors must keep data online and available because compliance regulations dictate so, and additionally businesses want to harness more raw data from multiple sources in order to conduct better analytics. Getting the balance right – finding the most efficient technology infrastructure while satisfying business demands – is the challenge.”

“The tolerance for $20,000 – $40,000 per terabyte of managed data is fast disappearing in the minds of enterprise decision makers.”
·         Prediction 4: Big Data Tools Must Satisfy both Business and Technical Users
·         Prediction 5: Heavyweights, such as Oracle and IBM, will Make Acquisitions in the Big Data Market
“Over the next 12 months, Big Data will take the spotlight because it is on every company’s short list.”



 4.4 million jobs in Big Data by 2015



  • By 2015, big data demand will reach 4.4 million jobs globally but only one third of those jobs will be filled. The availability of skills will remain an issue and will not improve in the short term.
  • More organizations will focus on reusing and sharing information infrastructure in a strategic manner, seeking to rationalize overlapping and redundant tools.
  • There will be two primary reasons that Big Data will lose its glamour by 2015:
    - Big Data is an overused and misused term.
    - The differentiation between IT vendors that manage and analyze Big Data versus those that do not will diminish. The term “Big Data” will become ubiquitous, becoming simply data.
  • There will be a rapid adoption of distributed processing such as MapReduce as an extension of the data warehouse across all companies.
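As an illustration of the distributed processing model the last bullet refers to, the MapReduce pattern can be sketched in plain Python. This is a toy, single-process stand-in; a real Hadoop job distributes the map and reduce phases across a cluster:

```python
from collections import defaultdict
from itertools import chain

def map_phase(record):
    # Emit (key, 1) pairs, e.g. counting occurrences per region
    # in records extracted from the warehouse.
    for word in record.split():
        yield word, 1

def shuffle(pairs):
    # Group all values by key, as the framework does between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Aggregate the grouped values for one key.
    return key, sum(values)

records = ["east east west", "west east"]
pairs = chain.from_iterable(map_phase(r) for r in records)
result = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(result)  # -> {'east': 3, 'west': 2}
```

The appeal for warehouse workloads is that each phase is embarrassingly parallel: map tasks run independently per input split, and reduce tasks run independently per key group.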


Expect the unexpected

IDC


  • Spending on Big Data technologies and services will cross $10 billion in 2013
  • Mergers and acquisitions will occur in the visual discovery, predictive analytics and text/rich media analytics areas.
  • VC funding will shift towards analytics and discovery tools and analytic applications.
  • More focus on graph analytics
  • Big Data analytics services will see a surge in demand
  • End users will seek service providers to play a more consultative, trusted-advisor role.
  • Expect the unexpected.

Beware of Hadump and rise of the Stacker

Gregory Piatetsky-Shapiro - Editor, KDnuggets.com



Paige Roberts @RobertsPaige Data dumped into Hadoop w/ no plan RT @sapdatatech: Word of the day: Hadump. Def: Analytical Sandbox @nancykoppdw #datahangout #bigdata

IBM big data @IBMbigdata Ha! RT @furukama: Let's call it Stacker RT @SAPDataTech Data Scientist is a combo of a statistician, a hacker + MBA @kdnuggets #datahangout

Hot Areas:

·         Mobile Data from iPhones, tablets, cars

o        also Mobile BI consumption

·         M2M (Machine to Machine), Sensor data

·         Social Networks

·         Machine Learning Successes on Big Data

·         Verticals: Energy, Utilities, Healthcare, HR

·         Big Analytics: in-memory DB, the next Hadoop?

·         Privacy or lack of it?



You may also download this report here (.pdf document).

