Hadoop, the open source framework for distributed processing of large data sets, is still taking shape as a "big data" technology, according to business intelligence and analytics consultant Wayne Eckerson. "The primary challenge right now is that it's really a 1.0 environment, so it's buyer-beware," Eckerson says in this video interview recorded at SearchBusinessAnalytics.com's Big Data Insights seminar in Chicago in August 2012.
But that shouldn't scare organizations away from deploying and using Hadoop as part of big data systems for collecting, managing and analyzing various types of unstructured data, adds Eckerson, who is director of research for the Business Applications and Architecture Media Group at TechTarget Inc., the parent company of SearchBusinessAnalytics.com.
He says Hadoop clusters provide a platform for storing unstructured and semistructured information "that really never fit neatly or cost-effectively into data warehousing environments." In addition, Hadoop's distributed file system enables users to start working with data more quickly than they typically can with information going into traditional enterprise data warehouses. "You don't have to model your data up front, or map it, transform it and load it," Eckerson says. "You really just dump it into the Hadoop file system and go."
That doesn't mean a Hadoop system will be an analytical speed demon. As a batch-processing environment, Hadoop isn't suited to doing "iterative, speed-of-thought querying," Eckerson says. But once companies start using Hadoop to manage sets of data, many find a variety of applications for the information, including uses they didn't initially anticipate. And that can provide rapid rewards, he says: "Oftentimes, the initial investment, whatever that is, ends up paying for itself quickly."
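To make the "dump it and go" idea concrete, here is a minimal sketch in plain Python of applying structure when data is read rather than when it is loaded. It does not use Hadoop or any Hadoop API; the sample records, the `read_with_schema` function and the field names are all hypothetical and chosen only for illustration.

```python
import json

# Hypothetical raw event lines, kept exactly as they arrived.
# Nothing was modeled, mapped or transformed before storing them.
RAW_LINES = [
    '{"user": "alice", "action": "click", "ms": 120}',
    '{"user": "bob", "action": "view"}',
    'not json at all',
]


def read_with_schema(lines, fields):
    """Apply a schema at read time: parse each raw line and keep only the requested fields."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that do not parse as JSON
        if not isinstance(record, dict):
            continue  # skip JSON values that are not objects
        # Fields missing from a record come back as None.
        yield {name: record.get(name) for name in fields}


# The same raw lines can be read again later with a different field list.
rows = list(read_with_schema(RAW_LINES, ["user", "ms"]))
print(rows)
```

The trade-off in this sketch is that malformed or incomplete records only surface when someone reads the data, which is the kind of cleanup a warehouse's up-front modeling and loading steps would otherwise handle.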
Viewers of the five-minute video will get Eckerson's take on the following:
- The definition of big data (0:15)
- How big data fits with existing BI and data warehousing environments (0:52)
- The benefits of using big data and technologies such as Hadoop (1:45)
- The challenges that organizations face in implementing Hadoop (2:55)
- Potential surprises to be aware of on Hadoop deployments (3:54)