Apache Spark News
May 12, 2017
Kafka is a linchpin in many on-premises data pipelines. Now, Confluent has a Kafka cloud service to ease this distributed system's ascent to cloud nirvana.
April 20, 2017
Corporate users are becoming more open to deploying big data systems with Apache Spark in the cloud, Databricks CEO Ali Ghodsi says in a Q&A on the open source processing platform.
February 21, 2017
Moving custom Spark and Hadoop pilot projects into production use has proved daunting. But container technology eased the transition at the Advisory Board analytics service.
February 20, 2017
Alfred Essa from McGraw-Hill and Mike Gualtieri from Forrester discuss serving the customer of one at Spark Summit East.
Apache Spark Get Started
Bring yourself up to speed with our introductory content
The desire to accelerate operational decision-making processes is leading organizations looking for a competitive edge to deploy streaming analytics platforms fed by real-time data. Continue Reading
Big data architectures typically involve multiple processing platforms. In this essential guide, you'll find information and advice on managing Hadoop, Spark and other big data technologies. Continue Reading
Amazon Elastic MapReduce helps our team process streaming data, but we've run into a number of issues. How can we identify and correct problems with these workloads? Continue Reading
Evaluate Apache Spark Vendors & Products
Weigh the pros and cons of technologies, products and projects you are considering.
This Talking Data podcast explores how latency and development challenges can make it difficult for users to start doing machine learning with Spark systems. Continue Reading
Cloud had a big impact on big data management and analytics last year. Machine learning and streaming designs will contribute to change in 2017. Continue Reading
Mesosphere DC/OS provides organizations of all sizes with large-scale container orchestration and microservices management tools for a variety of container formats. Continue Reading
Manage Apache Spark
Learn to apply best practices and optimize your operations.
New IBM machine learning capabilities let data center teams pull insights from their z/OS mainframes. But concerns about data management and cost will likely arise. Continue Reading
Big data platforms like Apache Spark process massive volumes of data faster than other options. As data volumes grow, enterprises seek ways to speed up Spark. Continue Reading
The big data ecosystem has many twists and turns. A McKesson data manager saw Splice Machine's database as a means to straighten the path by putting analytics and operations data in one place. Continue Reading
Problem Solve Apache Spark Issues
We’ve gathered up expert advice and tips from professionals like you so that the answers you need are always available.
Data center networking is no longer just a maze of physical cables; it's a tangled web of overlays and firewall rules. Database management is more than ensuring you have enough capacity as your company collects increasing volumes of data and expects real-time analysis.
Yet users demand simplicity; they expect the underlying infrastructure to be invisible. Executives want IT to function like a utility. When they turn on the tap, they don't care about the plumbing required to deliver the water; they simply want it to work. This is the tension threatening to plunge IT shops into chaos -- to build and support ever more complex data center infrastructure while making it appear effortless.
Nowhere is this tension clearer than in the growing demand to store and digest big data. However, it's not just about big data networking today. It's about doing something with that data -- and doing it now. Curiously, technologies that once aimed to streamline operations have sometimes led to more complexity. Networking overlays, for example, have given operators the ability to steer traffic and create logical resource pools, but they also come with additional management overhead.
All these topics and much more are covered in this month's Modern Infrastructure. Continue Reading
The challenges encountered in deriving business benefits from big data are huge, but so are the rewards. Hadoop and related technologies are easing those challenges to the point where companies are willing to graduate from experimental to full-blown big data analytics deployments. Still, the march toward that goal can be long and arduous, and not just from a technological and architectural standpoint. Before taking the plunge, big data users, including data scientists, managers and evangelists, are faced with the sometimes monumental task of justifying big data's return on investment to business executives focused on competition, profit margins and allocation of funds. "For a lot of organizations like ours, big data has not yet become a core foundation of running the business," said Beata Puncevic, director of analytics, data engineering and data management at Blue Cross Blue Shield of Michigan. Yet, actionable insights gained from big data analytics can be indispensable in driving revenue, reducing costs and developing new products.
This handbook on big data analytics examines the trials and tribulations of big data users who are on the front lines, devising and implementing partial and full-blown applications. In the first feature, editor Craig Stedman interviews battle-tested IT and analytics warriors from Blue Cross, Macy's and Progressive Insurance who reveal the business challenges in justifying the worthiness of big data applications. In the second feature, Stedman explains how real-time big data analytics is helping companies like Comcast and eBay to move quickly on massive amounts of incoming information. And in the third feature, reporter Ed Burns spotlights the decisions at Nielsen and Nasdaq to run or not to run big data systems in the cloud. Continue Reading
This Essential Guide explores enterprise data analytics strategies and how to select the right infrastructure, management tactics and technologies for your organization. Continue Reading