Streaming analytics tools drive real-time decision-making from data. This significantly affects a company's business processes, customer experiences -- even revenues.
Streaming analytics platforms complement stream processing tools. The stream processing tool combines, processes and transforms data from raw feeds. The streaming analytics platform transforms that live data into insights, quickly and at scale.
"Streaming data has become core to the central nervous system of most enterprises with most transactions happening in real time," said Zakir Hussain, partner in the data practice at EY Americas, a consulting and services company.
When choosing streaming analytics platforms, consider the development team's familiarity with the tools used by a given platform, how the tool integrates data and its support for existing mission-critical data pipelines, Hussain said.
Examples of the types of data used for streaming analytics include location data, marketing and sales data, point of sale data, machine health data -- including logs, security information and event management data -- and retail or warehouse inventory data.
Tool selection committees should also investigate how well a given platform works for a set of core use cases. They must explore the process of setting up analytics and the platform's ability to connect to automated decision-making workflows.
"Use cases that utilize real-time data to make critical decisions are seeing the fastest adoption for streaming data analytics," Hussain said.
Some common streaming analytics use cases include surge pricing, remote patient monitoring, automating customer offers and incentives, predictive maintenance and fraud detection. It is also essential to consider how well new tools work together as part of a full implementation.
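To make one of these use cases concrete, the sketch below shows the kind of rule a fraud detection pipeline might apply to a transaction stream: flag any amount far above the running average seen so far. The data, threshold and function name are hypothetical; a real platform would run far richer scoring models, but the shape of the logic is similar.

```python
# Simplified streaming fraud check: flag any transaction more than
# `factor` times the running average of the amounts seen before it.
# (Hypothetical rule and data, for illustration only.)
def flag_anomalies(amounts, factor=3.0):
    flagged = []
    total = 0.0
    for i, amount in enumerate(amounts):
        if i > 0 and amount > factor * (total / i):
            flagged.append(amount)
        total += amount
    return flagged

print(flag_anomalies([20.0, 25.0, 19.0, 310.0, 22.0]))  # [310.0]
```

In a production setting, the same comparison would run per event as records arrive, with state (the running total) held by the streaming engine rather than a local variable.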
"A common approach to selecting streaming analytics tools is choosing components that are typically packaged together," said Bhrugu Pange, managing director who leads the digital and technology practice at AArete, a global management consulting firm.
Examples include the following:
- Apache Kafka, Kafka Streams, Confluent ksqlDB and Druid.
- Amazon Kinesis, Kinesis Data Analytics and Amazon QuickSight.
- Microsoft Azure Stream Analytics and Power BI.
- Databricks and Spark Streaming.
The following are 10 streaming analytics tools to consider. They were selected based on support for modern features, strength in the market and the ecosystem of supporting tools and capabilities.
Altair
Altair makes tools for computer-aided engineering, widely used in automotive, aerospace and consumer electronics. It has extended beyond design and simulation into other aspects of data management, including stream processing and streaming analytics.
Altair's streaming analytics capabilities are a good fit for physical infrastructure using IoT sensors, digital twins of fleets of products and industrial controls. Altair's streaming analytics tools suit organizations working on apps in finance, oil and gas, logistics and industrial automation.
Its streaming analytics offerings include Panopticon for data visualization and RapidMiner for designing streaming analytics workflows. The platform runs on all major cloud platforms and in containerized environments, and it supports low-code development.
Amazon Kinesis Data Analytics
Amazon Kinesis Data Analytics (KDA) is a cloud-native streaming analytics tool. It suits companies looking to integrate streaming data from across an AWS cloud estate. It provides a way to provision serverless instances of Apache Flink stream and batch processing that automatically scales and integrates with other Amazon apps and third-party services connected via the AWS ecosystem.
It has comprehensive data analytics development tools such as Apache Zeppelin notebooks and Kinesis Data Analytics Studio. Developers can also create analytics apps using the Apache Beam programming model to set up data processing pipelines. However, enterprises may want to consider other tools instead of KDA when they need to work with streaming data outside the Amazon cloud.
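The Flink jobs that KDA provisions commonly compute windowed aggregates over a stream. As a rough illustration of that kind of job, the sketch below assigns events to fixed-size tumbling windows and counts them per window and key, in plain Python rather than Flink (the events and window size are hypothetical):

```python
from collections import defaultdict

# Tumbling-window event count: the style of aggregation a Flink job on
# Kinesis Data Analytics typically performs, sketched without Flink.
# Events are (timestamp_in_seconds, key) pairs; windows are fixed-size.
def tumbling_window_counts(events, window_seconds=60):
    counts = defaultdict(int)
    for ts, key in events:
        window_start = ts - (ts % window_seconds)  # align to window boundary
        counts[(window_start, key)] += 1
    return dict(counts)

events = [(5, "clicks"), (42, "clicks"), (61, "clicks"), (70, "views")]
print(tumbling_window_counts(events))
# {(0, 'clicks'): 2, (60, 'clicks'): 1, (60, 'views'): 1}
```

A managed runtime adds what this sketch omits: continuous ingestion, checkpointed state, late-data handling and automatic scaling as the event rate changes.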
Cloudera DataFlow
Cloudera has its roots in the big data processing space. It has extended its core strength in constructing large-scale data lakes with Cloudera DataFlow (CDF), a scalable, real-time data streaming processing and analytics platform.
Enterprises can craft streaming analytics apps that span private infrastructure and public cloud services. Developers can connect to any data source using Apache NiFi, an open source data routing and transformation layer. It supports serverless microservices that can run in AWS Lambda, Microsoft Azure Functions and Google Cloud Functions, so enterprises can scale apps across multiple cloud platforms.
Confluent ksqlDB
Confluent was founded by the developers of the Apache Kafka data processing framework. Kafka is not a streaming analytics platform on its own. Confluent developed ksqlDB, a database to help developers create streaming analytics applications on top of Kafka.
Confluent's offerings use the latest enhancements in the open source platform. They provide ksqlDB as a standalone application that companies can deploy internally or as part of a managed service that comes with security and management tools.
Confluent ksqlDB fits companies that want to customize Kafka tools for specific streaming analytics applications while also accessing advances and security updates in the core Kafka platform.
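To give a flavor of the model, a ksqlDB continuous query groups and aggregates a Kafka stream as events arrive, maintaining a materialized table. The Python sketch below emulates that incremental GROUP BY over an in-memory list; the stream name, fields and data are hypothetical:

```python
# Emulates the effect of a ksqlDB continuous aggregation along the lines of:
#   CREATE TABLE pageviews_by_region AS
#     SELECT region, COUNT(*) FROM pageviews GROUP BY region EMIT CHANGES;
# Each incoming event incrementally updates the materialized table.
def materialize_counts(events):
    table = {}
    for event in events:
        region = event["region"]
        table[region] = table.get(region, 0) + 1
    return table

stream = [{"region": "us"}, {"region": "eu"}, {"region": "us"}]
print(materialize_counts(stream))  # {'us': 2, 'eu': 1}
```

The difference in ksqlDB is that the query never terminates: the table is updated continuously as new events land on the Kafka topic, and downstream applications can query or subscribe to it.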
Google Cloud Dataflow
Google Cloud Dataflow is a cloud-native managed streaming analytics tool. It can automatically scale analytics workflows with changes in data input or analytics requirements. Teams can set up the infrastructure using Apache Beam to configure processing pipelines.
Dataflow fits with Google's ecosystem of AI, machine learning and data processing tools. A library of prebuilt analytics and AI templates simplifies development of streaming analytics apps for personalization, predictive analytics and anomaly detection. A smart diagnostics feature helps to identify and visualize streaming analytics bottlenecks and recommend fixes.
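Dataflow pipelines defined with Apache Beam are chains of transforms: a source feeds filters and mappers that feed a sink. The sketch below mimics that pipeline shape with plain Python generators rather than the Beam SDK; the log records and field names are hypothetical:

```python
# A Dataflow/Beam pipeline chains transforms (read -> filter -> map -> sink).
# This sketch mimics that shape with lazy Python generators, no Beam required.
def source(records):
    yield from records

def filter_errors(stream):
    return (r for r in stream if r["level"] == "ERROR")

def extract_service(stream):
    return (r["service"] for r in stream)

logs = [
    {"level": "INFO", "service": "auth"},
    {"level": "ERROR", "service": "billing"},
    {"level": "ERROR", "service": "auth"},
]
print(list(extract_service(filter_errors(source(logs)))))  # ['billing', 'auth']
```

Because each stage is lazy, records flow through one at a time, which is the same one-record-in-flight intuition behind a managed streaming pipeline, minus the distribution, autoscaling and fault tolerance the service provides.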
IBM Streaming Analytics
IBM has integrated its streaming analytics capabilities into IBM Streaming Analytics for IBM Cloud. IBM provides a variety of accelerators to help apply streaming analytics to common problems.
The platform helps develop streaming analytics apps for finance, HR, IT, marketing and supply chain use cases. It includes various tools for working with unstructured data, text, video and IoT sensor data. It can also help set up predictive and prescriptive analytics workflows, which are typically more challenging than diagnostic analytics popular with BI use cases. These tools can improve financial and operational modeling, sales planning and workforce performance planning.
Microsoft Azure Stream Analytics
Microsoft Azure Stream Analytics is the company's cloud-native streaming analytics service. It can provision analytics apps that recover when faults emerge. The platform also works with Microsoft's other AI development tools, so developers can integrate machine learning models into streaming analytics workflows.
It suits companies that have standardized on Azure infrastructure, particularly those using Azure IoT Edge.
SAS Event Stream Processing
The SAS Event Stream Processing tool works with the company's statistics, analytics, AI and machine learning tools. Users can stream data from operations, transaction processing and IoT sensors into various analytics workflows.
SAS Event Stream Processing can take advantage of various tools to clean, connect, correlate and transform data in motion. It also helps set up predefined components to filter, normalize and standardize data sets as they are ingested into real-time analytics apps, and then store the appropriate subset for subsequent analysis. Developers and data scientists can create new data models using low-code tools.
Companies can use this tool to develop streaming analytics apps that require complex data transformation, particularly when they already have an established SAS practice.
Software AG Apama Streaming Analytics
Software AG acquired Apama in 2013, and the technology now forms the core of its Apama Streaming Analytics platform. The product integrates with Software AG's Cumulocity IoT platform and suits various IoT, industrial automation and logistics workflows.
In addition, Apama can handle high-volume applications in algorithmic trading and fraud detection. It supports building complex real-time analytics processing pipelines that consist of thousands of individual processing monitors. Other capabilities let users replay streams to identify problems or simulate various scenarios in predictive and prescriptive analytics applications.
Organizations working with IoT data or needing to scale complex data processing and analytics pipelines should evaluate Apama.
Tibco Streaming
Tibco Streaming is part of the company's Hyperconverged Analytics offering, a set of tools that combines data science, stream processing and visual analytics.
Developers can create analytics apps using Tibco Spotfire for data analysis, visualization and model creation. Tibco Streaming can also work with Tibco's decision processing tools that help define complex decision models, take actions and evolve to make better decisions over time.
Tibco can help enterprises looking to simplify integration between data processing and automated decision-making at scale.
Consider technical and business needs
Stream processing tools increase performance and scalability, improving an organization's ability to handle large volumes of data in real time. Cloud support enables scale, with data processing tasks run in parallel.
Some of the technical features to consider include API integration, advanced self-service capabilities, custom alerts and collaboration features, said Bharath Thota, partner in the advanced analytics practice of Kearney, a global strategy and management consulting firm. Regulated industries also need support for role-based data access and activity logs to ensure compliance.
There's more than just technical capabilities to consider when selecting a tool. Too many companies focus on platform cost and technical criteria rather than the factors that fundamentally determine ROI and value creation, said Nick Kramer, leader of applied solutions at SSA & Company, a global consulting firm.
Successful streaming analytics investments require clear goals with quantifiable outcomes of what the product should accomplish.
"Business ownership is crucial because then we start to use different criteria, such as prebuilt models and user experience, instead of speed and scintillating performance," Kramer said.
Business alignment enables a more detailed and nuanced set of requirements to evaluate tools, and improves vendor responses to inquiries about their products. This kind of analysis may also point enterprises away from some of the generic tools listed here to comprehensive systems that solve specific streaming analytics problems more effectively and efficiently.
Prebuilt systems focused on consumer packaged goods services and digital twins for manufacturing are on the rise. These options can reduce the complexity of deployment and the time it takes, and they mean organizations are no longer constrained by their ability to hire data scientists and modern data engineers.
"This explosion of choices makes choosing the right fit harder, but it also makes it possible to get it just right," Kramer said.