Date: 2025-04-02

Databricks on Wednesday launched Lakeflow Connect with the general availability of connectors for Salesforce and Workday.

Lakeflow Connect, first unveiled in July, is a set of low-code/no-code connectors between the Databricks Data Intelligence Platform and data sources such as SaaS applications, databases and file sources. Together with Delta Live Tables (DLT) for data transformation and Databricks Workflows for data orchestration, it makes up the Lakeflow engineering suite.

Lakeflow Connect is powered by serverless compute, which enables users to run workflows without having to provision clusters -- Databricks manages and scales the requisite compute power. In addition, it integrates with Databricks' governance, observability and security capabilities, including Unity Catalog.

Kevin Petrie, an analyst at BARC U.S., noted that BARC research shows more than 90% of AI leaders are at least testing the use of structured data to inform applications. Nearly two-thirds are using real-time data to train applications, according to the research.

As a result, Lakeflow Connect is a significant addition, according to Petrie.

"Salesforce and Workday applications provide exactly this type of data as inputs for real-time machine learning and GenAI use cases," he said. "Databricks is right to simplify data access in this fashion."

Based in San Francisco, Databricks is a data platform vendor that helped pioneer the lakehouse format for storing data. Like many other data management vendors, Databricks has expanded into AI development during the past two-plus years.

Connecting to data

Data ingestion is critical but complex.

It's simply the process of obtaining and importing data into systems such as databases, data warehouses, data lakes and data lakehouses. But building and maintaining pipelines that move data from the systems where it's created -- such as Salesforce and Workday -- into systems where it's stored and prepared to inform analysis is complicated. It often involves developing an infrastructure that includes data extraction tools, streaming data platforms such as Apache Kafka and change data capture (CDC), among other capabilities.

The result is that engineers spend substantial time piecing together and maintaining disparate tools, some of which eventually fail when the scale exceeds their capabilities, with both the time and technology purchases adding up to a significant expense.

Databricks heard from customers about the trouble they were having ingesting data, and that feedback provided the impetus for developing Lakeflow Connect, according to Michael Armbrust, a distinguished software engineer at Databricks.

The vendor provided connectors to numerous data sources before Lakeflow Connect, but customers had to configure them and then maintain them as the APIs, schemas and other aspects of data sources changed.

In October 2023, Databricks acquired Arcion for $100 million to add improved data ingestion capabilities. Lakeflow Connect represents Databricks' integration of Arcion with its Data Intelligence Platform.

"Customers need this data, but before this announcement they were forced to use third-party tools that oftentimes at large scale would fall over, so they would have to build their own custom solutions," Armbrust said. "This makes [ingestion] point-and-click within Databricks."

Using Lakeflow Connect's first two connectors, engineers can create data ingestion pipelines with either a few clicks or a few lines of code so that data created in Salesforce and Workday can be quickly and easily extracted and moved into the Data Intelligence Platform.

In addition, because the connectors integrate with the Data Intelligence Platform, data governance policies developed in Unity Catalog are automatically applied to Salesforce and Workday data as it's ingested into the Databricks environment.

Donald Farmer, founder and principal of TreeHive Strategy, noted that many other vendors provide connectors to data sources to simplify data ingestion. For example, Qlik provides Connector Factory for its customers.

Lakeflow Connect is nevertheless valuable for Databricks users, demonstrating progress on the part of the vendor and representing a "milestone." In particular, its integration with Unity Catalog and its CDC capabilities are notable, according to Farmer.

"It's difficult to say that Lakeflow Connect is unique, but the integration with Unity Catalog and the CDC, which they acquired from Arcion, are useful elements," he said.

In addition, Farmer highlighted serverless compute as an important aspect of Lakeflow Connect.

"The serverless compute may be quietly important, not just for its seamless scalability but for the rapid startup times, which are important in reducing latency when running many complex pipelines," he said.

Beyond simplifying data ingestion, Lakeflow Connect is designed to make it easier for data engineers to transform and orchestrate data within the Lakeflow engineering environment to prepare it for analysis and AI development. In conjunction with DLT and Databricks Workflows, Lakeflow Connect helps provide engineers with a unified data preparation environment.