Technologies like Spark, Kafka and Flink are making real-time analytics on streaming data more feasible. Enterprises are finding a variety of creative ways to draw insights by combining streaming data with other sources.
"Real-time streaming analytics makes it easier to determine what's working and what isn't working, at a faster pace," said Rishi Sood, engineering manager on the local services team at Trulia, a real estate service. High-performance teams base their success on various metrics that are driven by any number of variables such as features, a marketing campaign, product reviews and up-to-date data. This success comes down to understanding real-time data and making sense of everything that is changing.
"Having a real-time analysis pipeline allows for making sense of all of the changes in real time, which, ultimately, allows for better management of a company's product and user base," Sood said.
According to Mark Palmer, senior vice-president and general manager of analytics at Tibco Software Inc., "the adoption of streaming analytics is on the rise because the accessibility of data in motion has reached a tipping point." It's commonplace for an enterprise to have access to constantly changing data, which includes sales leads, transactions, mobile apps, customer service calls, kiosks, social media activity, customer orders, chat messages and supply chain updates.
"Streaming analytics are designed to process data in motion, rather than only data at rest, so it's better suited for digital information that flows from connected things, connected people and connected systems," Palmer said.
Recent advancements in streaming BI help democratize access to real-time data, with easy-to-use UIs that let users tap any data attribute or event trigger and capture business decisions to automate. Now, instead of requiring an army of developers to provide access to real-time data, business analysts can do it themselves with about as much effort as loading an Excel spreadsheet. Here are some examples of how enterprises are tapping into real-time streaming analytics.
1. Fine-tune app features
Trulia is constantly pushing out features and then acting on real-time streaming data to understand their adoption and ensure success. If a feature isn't getting enough traffic, a real-time streaming pipeline might message the marketing department to kick off an initiative to drive more traffic immediately.
Once results are seen, marketing could decide to either pause the initiative or double down and continue. Ultimately, this adds a level of sophistication to new rollouts and core feature monitoring, and supports data-driven decisions, Sood said. Real-time streaming analytics can also help with anomaly detection or even predictive analytics, which can be used to improve consumer experiences.
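The traffic check described above can be illustrated with a sliding-window counter. This is a minimal sketch, not Trulia's pipeline: the class name, the 60-second window and the three-event threshold are all hypothetical, and a production system would typically run this logic inside a stream processor rather than in a single process.

```python
from collections import deque


class FeatureTrafficMonitor:
    """Counts feature-usage events inside a sliding time window (hypothetical helper)."""

    def __init__(self, window_seconds, min_events):
        self.window_seconds = window_seconds
        self.min_events = min_events
        self.timestamps = deque()

    def record(self, timestamp):
        """Register one usage event; timestamps are assumed to arrive in order."""
        self.timestamps.append(timestamp)

    def count(self, now):
        """Drop events that fell out of the window and return how many remain."""
        cutoff = now - self.window_seconds
        while self.timestamps and self.timestamps[0] <= cutoff:
            self.timestamps.popleft()
        return len(self.timestamps)

    def needs_attention(self, now):
        """True when traffic in the current window is below the threshold."""
        return self.count(now) < self.min_events


monitor = FeatureTrafficMonitor(window_seconds=60, min_events=3)
for t in (0, 10, 20, 30):
    monitor.record(t)

print(monitor.needs_attention(now=40))  # four events are still in the window
print(monitor.needs_attention(now=85))  # only the event at t=30 remains
```

In this sketch, a true result from `needs_attention` is the point at which a pipeline might notify the marketing team.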
2. Manage location data
Another use case at Trulia is the maintenance and processing of location-based data, such as boundaries and centroids that reflect the shape of areas. The vast majority of data Trulia provides to users is location-specific. When the company needs to update a city boundary, change a ZIP code or make county-line adjustments, this needs to be reflected in other analytics delivered to customers and managers.
All the data tied to that location also must change in order to provide consistent data accuracy. "This is where real-time analytics can determine affected data sets and signal the appropriate updates to those systems, so that accurate data is provided to our consumers," Sood said. Other analytics tools might take days to effect these changes.
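One way to picture how a pipeline determines the affected data sets is a reverse index from location to the data sets derived from it. The sketch below is an illustration under stated assumptions, not Trulia's implementation: the class, the location IDs and the data set names are all hypothetical.

```python
from collections import defaultdict


class LocationDependencyIndex:
    """Maps a location ID to the data sets derived from it (hypothetical helper)."""

    def __init__(self):
        self._datasets_by_location = defaultdict(set)

    def register(self, dataset, location_ids):
        """Record that `dataset` depends on each of the given locations."""
        for location_id in location_ids:
            self._datasets_by_location[location_id].add(dataset)

    def affected_by(self, location_id):
        """Return the data sets to refresh when this location changes."""
        return sorted(self._datasets_by_location.get(location_id, ()))


index = LocationDependencyIndex()
index.register("listing_counts", ["zip:94107", "city:san-francisco"])
index.register("median_prices", ["zip:94107"])

# A ZIP code boundary change arrives as a stream event.
print(index.affected_by("zip:94107"))
print(index.affected_by("zip:00000"))  # unknown location, nothing to refresh
```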
3. Conduct real-time personalization
"The battle for user attention is fiercer than ever, and there is a measurable advantage to be gained by providing relevant and personalized experiences," said Fabian Hueske, co-founder of Ververica (formerly Data Artisans) and a committer and project management committee member of the Apache Flink project. Building real-time personalization experiences can be a challenging task for enterprises, and streaming analytics can help immensely.
As an example, Alibaba, which recently acquired Ververica, is using Apache Flink to improve its personalization for millions of products across its e-commerce platform. Alibaba is the largest e-commerce retailer in the world, with millions of different customers searching for millions of products on the company's websites and portals. Real-time streaming analytics helps to improve search relevance on its properties. With a changing product catalog, both in terms of price and availability, personalization has a meaningful impact on sales, Hueske said.
Alibaba uses Apache Flink to maintain both its incrementally and fully updated product catalogs, ensuring that changes in price or availability are reflected in search rankings as quickly as possible. It also uses Flink to continuously train machine learning models in real time, resulting in a search platform that factors in both an ever-changing product catalog and user preferences based on current and past behavior.
4. Detect anomalies and fraud in real time
Enterprises must be able to identify outliers such as security breaches, network outages or machine failures in real time. In financial services, companies can respond to potentially fraudulent sign-in attempts or credit card transactions by joining a real-time activity stream with historic account usage data, Hueske said.
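The join of live activity with historic account data can be sketched as a lookup against a table of per-account profiles. This is only an illustration: the profile schema, the rules and the `amount_factor` threshold are assumptions, not a description of any vendor's fraud logic.

```python
from dataclasses import dataclass


@dataclass
class AccountProfile:
    """Historic usage summary for one account (hypothetical schema)."""
    avg_amount: float
    known_countries: frozenset


def flag_suspicious(events, profiles, amount_factor=5.0):
    """Join each live event with its account's history and yield flagged ones."""
    for event in events:
        profile = profiles.get(event["account"])
        if profile is None:
            # No history to compare against, so surface the event for review.
            yield event, "no history"
        elif event["country"] not in profile.known_countries:
            yield event, "new country"
        elif event["amount"] > amount_factor * profile.avg_amount:
            yield event, "unusual amount"


profiles = {
    "acct-1": AccountProfile(avg_amount=40.0, known_countries=frozenset({"US"})),
}
stream = [
    {"account": "acct-1", "amount": 35.0, "country": "US"},
    {"account": "acct-1", "amount": 900.0, "country": "US"},
    {"account": "acct-1", "amount": 20.0, "country": "BR"},
]
for event, reason in flag_suspicious(stream, profiles):
    print(reason, event)
```

Because `flag_suspicious` is a generator, it handles events one at a time as they arrive rather than waiting for a complete batch.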
As an example, Mux, a monitoring and metrics company for streaming video, built an anomaly detection system to monitor playback quality and to detect and respond to error spikes. Similarly, Microsoft built an anomaly detection engine that detects malicious activity in the cloud in real time, such as compromised accounts, insider threats and ransomware.
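A common generic technique for spotting error spikes is to compare each new reading with a rolling baseline. The sketch below uses a rolling mean and standard deviation; it is not the method used by Mux or Microsoft, and the window size, threshold and `min_spread` floor are illustrative assumptions.

```python
from collections import deque
from statistics import mean, pstdev


def detect_spikes(error_rates, history=5, threshold=3.0, min_spread=0.001):
    """Return indexes whose value exceeds the rolling mean by `threshold` deviations."""
    window = deque(maxlen=history)
    spikes = []
    for index, rate in enumerate(error_rates):
        if len(window) == history:
            baseline = mean(window)
            # Floor the spread so a flat baseline neither divides by zero
            # nor flags tiny fluctuations.
            spread = max(pstdev(window), min_spread)
            if (rate - baseline) / spread > threshold:
                spikes.append(index)
        window.append(rate)
    return spikes


rates = [0.010, 0.012, 0.011, 0.009, 0.010, 0.011, 0.080, 0.010]
print(detect_spikes(rates))  # [6]: the jump to 0.080 stands out from its baseline
```

Note that the spike itself enters the rolling window afterward, which temporarily inflates the baseline; a production detector might exclude flagged values from the history.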
5. Provide healthcare, emergency and humanitarian services
Wearable health devices, such as watches, are reported to have already saved lives through electrocardiogram tests, alerting users to the possibility that they might be experiencing atrial fibrillation. This could be combined with data about the movement of elderly people in their homes to proactively notify relatives or caregivers when required. Porting a powerful analytical engine to edge devices, such as bikes or drones, extends this approach to emergency response: computer vision, traffic, weather and geospatial data can be analyzed together to provide recommendations to incident commanders, said Dan Sommer, senior director and global market intelligence lead at Qlik, an analytics tool provider.
6. Move analytics to the edge
There are many use cases requiring real-time analytics in the industrial and commercial IoT sectors, such as manufacturing, oil and gas, transportation, smart cities and smart buildings. Potential use cases range from predicting failure conditions in a manufacturing process based on live sensor data, to reducing scrap and improving yields, to detecting anomalies in an oil rig drilling station.
Sastry Malladi, CTO of FogHorn Systems, an IoT platform provider, said, "There are literally thousands of use cases in these sectors where real-time decisions are needed for better operational efficiencies and yields, but [companies] can't afford either money or time to send all the raw data to a cloud or data center environment for batch processing." Real-time streaming analytics at the edge avoids connectivity requirements, the bandwidth costs of transmitting unprocessed raw data, cybersecurity attacks on valuable machinery, and latencies that could cause an operator to miss an event, resulting in huge losses.
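The bandwidth argument can be made concrete with a small example in which an edge device reduces a batch of raw readings to a compact summary and forwards only that summary plus any out-of-range values. This is a simplified sketch, not FogHorn's product; the function name, the field names and the 60-to-90 operating range are hypothetical.

```python
def summarize_at_edge(readings, low, high):
    """Reduce a batch of raw sensor readings to one compact summary."""
    if not readings:
        return None
    # Keep the individual values only when they fall outside the expected range.
    out_of_range = [r for r in readings if r < low or r > high]
    return {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": sum(readings) / len(readings),
        "out_of_range": out_of_range,
    }


batch = [70.1, 70.4, 69.9, 98.6, 70.2]
summary = summarize_at_edge(batch, low=60.0, high=90.0)
print(summary["count"], summary["out_of_range"])
```

Only `summary` would need to leave the device in this sketch, while the local check on `out_of_range` lets the device react without a round trip to a data center.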
7. Empower advertising and marketing campaigns
Streaming analytics can help manage multiple sources of data for improving advertising and marketing campaigns, Tibco's Palmer said. Data sources can include ad inventory, web traffic, click logs and both customer demographic and behavioral data. This data can be analyzed together to uncover insights that improve audience targeting, pricing strategies and conversion rates, increase campaign ROI and create new revenue opportunities.