Apache Spark News
August 10, 2017
Information Builders has released a free developer version of its big data integration platform, which could potentially help channel companies take on Hadoop projects.
June 06, 2017
Databricks brings new features to its managed Spark platform -- as well as to open source Spark -- that it hopes will make the computing engine more widely usable.
May 12, 2017
Kafka is a linchpin in many on-premises big data pipelines. Now, software vendor Confluent is offering a Kafka cloud service to ease use of the messaging and data streaming system in the cloud.
April 20, 2017
Corporate users are becoming more open to deploying big data systems with Apache Spark in the cloud, Databricks CEO Ali Ghodsi says in a Q&A on the open source processing platform.
Apache Spark Get Started
Bring yourself up to speed with our introductory content
The desire to accelerate operational decision-making processes is leading organizations looking for a competitive edge to deploy streaming analytics platforms fed by real-time data. Continue Reading
Big data architectures typically involve multiple processing platforms. In this essential guide, you'll find information and advice on managing Hadoop, Spark and other big data technologies. Continue Reading
Amazon Elastic MapReduce helps our team process streaming data, but we've run into a number of issues. How can we identify and correct problems with these workloads? Continue Reading
Evaluate Apache Spark Vendors & Products
Weigh the pros and cons of technologies, products and projects you are considering.
It's not too late to consider signing up for data management conferences in June. Here's a quick rundown of four events focused on Hadoop, Spark, data governance and other topics. Continue Reading
This Talking Data podcast discusses how Spark users are finding that latency and development challenges can make it difficult to start doing machine learning with Spark systems. Continue Reading
Cloud had a big impact on big data management and analytics last year. Machine learning and streaming designs will contribute to change in 2017. Continue Reading
Manage Apache Spark
Learn to apply best practices and optimize your operations.
Organizations with big data environments are starting to prepare data for analysis before making it available to data scientists and other users, instead of leaving the work to them. Continue Reading
Processing in big data systems can slow to a crawl if queries are not properly tuned or workloads not well balanced -- issues that call for careful monitoring of clusters. Continue Reading
New IBM machine learning capabilities let data center teams pull insights from their z/OS mainframes. But concerns about data management and cost will likely arise. Continue Reading
Problem Solve Apache Spark Issues
We’ve gathered up expert advice and tips from professionals like you so that the answers you need are always available.
Data center networking is no longer just a maze of physical cables; it's a tangled web of overlays and firewall rules. Database management is more than ensuring you have enough capacity as your company collects increasing volumes of data and expects real-time analysis.
Yet users demand simplicity; they expect the underlying infrastructure to be invisible. Executives want IT to function like a utility. When they turn on the tap, they don't care about the plumbing required to deliver the water; they simply want it to work. This is the tension threatening to plunge IT shops into chaos -- to build and support ever more complex data center infrastructure while making it appear effortless.
Nowhere is this tension clearer than in the growing demand to store and digest big data. However, it's not just about big data networking today. It's about doing something with that data -- and doing it now. Curiously, technologies that once aimed to streamline operations have sometimes led to more complexity. Networking overlays, for example, have given operators the ability to steer traffic and create logical resource pools, but they also come with additional management overhead.
All these topics and much more are covered in this month's Modern Infrastructure. Continue Reading
The challenges encountered in deriving business benefits from big data are huge, but so are the rewards. Hadoop and related technologies are easing those challenges to the point where companies are willing to graduate from experimental to full-blown big data analytics deployments. Still, the march toward that goal can be long and arduous, and not just from a technological and architectural standpoint. Before taking the plunge, big data users, including data scientists, managers and evangelists, are faced with the sometimes monumental task of justifying big data's return on investment to business executives focused on competition, profit margins and allocation of funds. "For a lot of organizations like ours, big data has not yet become a core foundation of running the business," said Beata Puncevic, director of analytics, data engineering and data management at Blue Cross Blue Shield of Michigan. Yet, actionable insights gained from big data analytics can be indispensable in driving revenue, reducing costs and developing new products.
This handbook on big data analytics examines the trials and tribulations of big data users who are on the front lines, devising and implementing partial and full-blown applications. In the first feature, editor Craig Stedman interviews battle-tested IT and analytics warriors from Blue Cross, Macy's and Progressive Insurance who reveal the business challenges in justifying the worthiness of big data applications. In the second feature, Stedman explains how real-time big data analytics is helping companies like Comcast and eBay to move quickly on massive amounts of incoming information. And in the third feature, reporter Ed Burns spotlights the decisions at Nielsen and Nasdaq to run or not to run big data systems in the cloud. Continue Reading
This Essential Guide explores enterprise data analytics strategies and how to select the right infrastructure, management tactics and technologies for your organization. Continue Reading