Designing and building data pipelines for business intelligence and analytics can be challenging. Data pipeline startup Ascend.io is looking to make that task a bit easier with its new Queryable Dataflows capability.
Queryable Dataflows builds on top of Ascend's Autonomous Dataflow Service, which helps developers build data pipelines more intuitively.
With Queryable Dataflows, which Ascend put into technical preview Aug. 27, data engineers can do more than build data pipelines: they can also search and query the data inside those pipelines, without first needing to have the data stored in a data warehouse.
It's a capability that John Santaferraro, research director at Enterprise Management Associates (EMA), said he sees as having some real potential. Digital transformation has created a big opportunity for real-time analytics and the value of a quick, intelligent response continues to increase as the digital economy grows, he said.
"Engagement data from mobile, IoT and internet is perishable," Santaferraro said. "Because of its time-sensitive nature, the ability to ask questions of data within a dataflow enables organizations to quickly make decisions on next best action, next best offer or other insight-driven responses in real time."
Why building data pipelines matters
Data pipelines are an important element of the modern data economy and are an evolution of the broader data integration market, which has become fragmented.
The market started with server-based extract, transform and load (ETL), moved on to the API revolution and finally underwent massive change in big data and cloud waves, Santaferraro explained. In his view, the proliferation of data in motion has created new challenges that legacy technology cannot address. The result is an explosion of data pipelines flowing in all directions, in and out of platforms, applications and machines.
"To address the requirements of digital everything, new vendors, like Ascend.io, are utilizing artificial intelligence [AI] and machine learning [ML] to create an autonomous service in which users can build, scale, operate, automate and optimize data pipelines," Santaferraro said. "EMA believes that the use of AI and ML in data integration, preparation and pipeline management will finally bring the scattered market together and overcome the complexity that was created over the last 20 years."
A "data pipeline world"
Sean Knapp, CEO and founder of Ascend, based in Palo Alto, Calif., said he started the company after realizing that the businesses with the greatest competitive advantage were those able to build advanced data pipelines.
"We believe that it is a data pipeline world," Knapp said. "The thing that I think creates the opportunity for us and to frankly, not only exist, but to thrive, is the fact that actual creation, maintenance and operation of data pipelines is insanely difficult."
The Autonomous Dataflow Service is Ascend's answer to the increasingly complex challenge of building data pipelines, providing automation for the more difficult aspects of data pipeline design and operation. Knapp explained that the Autonomous Dataflow Service takes a declarative approach, in which developers define what they want to happen, rather than needing to define each individual step of a data pipeline process.
"You can just define and architect what your data pipeline should look like and that really leverages what we call our data flow control plane," Knapp explained. "That's the smarts of our system that really cuts out huge portions of code and complexity for you as a data engineer."
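As a rough illustration of what a declarative approach means in general, the minimal Python sketch below describes a pipeline as a set of named stages and their inputs, and leaves the execution order to a small engine. It is not Ascend's API: the `PIPELINE` spec, the stage names and the `run` function are all hypothetical, chosen only to show the idea.

```python
from collections import Counter
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical declarative spec: each stage states WHAT it depends on and
# how to derive its output. Nothing here says in which order stages run.
PIPELINE = {
    "raw_events": {
        "inputs": [],
        "transform": lambda: [
            {"user": "a", "ms": 120},
            {"user": "b", "ms": 340},
            {"user": "a", "ms": 80},
        ],
    },
    "slow_events": {
        "inputs": ["raw_events"],
        "transform": lambda rows: [r for r in rows if r["ms"] >= 100],
    },
    "slow_count_by_user": {
        "inputs": ["slow_events"],
        "transform": lambda rows: dict(Counter(r["user"] for r in rows)),
    },
}


def run(pipeline):
    """Work out the execution order from the declared inputs, then run it."""
    graph = {name: set(spec["inputs"]) for name, spec in pipeline.items()}
    results = {}
    # static_order() yields each stage only after all of its inputs.
    for name in TopologicalSorter(graph).static_order():
        spec = pipeline[name]
        args = [results[dep] for dep in spec["inputs"]]
        results[name] = spec["transform"](*args)
    return results


results = run(PIPELINE)
print(results["slow_count_by_user"])
```

The engineer only edits the spec; ordering and wiring between stages are derived from it, which is the part a control plane would take over in a real system.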
The new Queryable Dataflows capability is an attempt to solve a problem that Knapp said Ascend was seeing in the field: As engineers build data pipelines, they constantly flip between exploratory development and production stages to see what will work.
Knapp said that the Autonomous Dataflow Service can persist data and, in so doing, can make any stage of any data transform look and feel as if it were a data warehouse table. With the Queryable Dataflows capability, users can now run ad hoc queries against any stage of the data pipeline.
Knapp said Ascend intends the query capability to help accelerate development. He emphasized, however, that Ascend is not positioning itself to replace data warehouses or purpose-built business intelligence tools.
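The general idea of querying an intermediate pipeline stage as if it were a table can be sketched with Python's built-in `sqlite3` module. This is an assumption-laden toy, not how Ascend implements the feature: the `persist_stage` helper, the table names and the sample rows are all invented for illustration.

```python
import sqlite3

# In-memory database standing in for wherever stage outputs are persisted.
conn = sqlite3.connect(":memory:")


def persist_stage(conn, name, rows):
    """Hypothetical helper: store one stage's output as a SQL table.

    `name` is interpolated into the SQL, so it must be a trusted identifier.
    """
    conn.execute(f"DROP TABLE IF EXISTS {name}")
    conn.execute(f"CREATE TABLE {name} (user_id TEXT, ms INTEGER)")
    conn.executemany(f"INSERT INTO {name} VALUES (?, ?)", rows)


# Stage 1: raw input (sample data, made up for this sketch).
raw_events = [("a", 120), ("b", 340), ("a", 80)]
persist_stage(conn, "raw_events", raw_events)

# Stage 2: an intermediate transform, persisted the same way.
slow_events = [row for row in raw_events if row[1] >= 100]
persist_stage(conn, "slow_events", slow_events)

# Ad hoc query against the intermediate stage, with no separate warehouse load.
avg_by_user = conn.execute(
    "SELECT user_id, AVG(ms) FROM slow_events GROUP BY user_id ORDER BY user_id"
).fetchall()
print(avg_by_user)
```

Because every stage is persisted under its own name, an engineer exploring the pipeline could point the same query at `raw_events` instead of `slow_events` to compare the data before and after the transform.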
With Queryable Dataflows, Knapp said, a certain amount of processing that might have previously needed to occur in a data warehouse can potentially be offloaded, making the entire data pipeline and analysis process more efficient.
Ascend is making Queryable Dataflows available as a technical preview, with general availability set for later this quarter. Looking forward, Knapp said that Ascend will continue to work on making the capability easier to use when building data pipelines.
"A really big theme that we're working on is dataflows and data pipelines as code and the notion of DataOps, not as a buzzword, but as the true sort of DevOps equivalent for data," Knapp said. "It's about how we can help to facilitate a far more efficient data development lifecycle."