
Databricks launches PostgreSQL Lakebase to aid AI developers

Resulting from the $1B acquisition of Neon, the database built for AI workloads -- with compute separated from storage -- is now integrated with the vendor's broader platform.

Databricks Lakebase is now generally available, eight months after the PostgreSQL database built for AI development was first unveiled in public preview.

Lakebase, which launched on AWS on Feb. 3, is the result of Databricks' $1 billion acquisition of Neon, a cloud database vendor whose platform is built on open source PostgreSQL, in May 2025. Databricks has since rebranded Neon's capabilities and integrated them with its Data Intelligence Platform to give customers an operational database alongside its data lakehouse.

Beyond integration with Databricks' broader platform, Lakebase fosters AI development by separating compute from storage, unlike many PostgreSQL databases that couple the two. By separating the processing power for queries from the resources needed for storage, Lakebase eliminates competition between them for memory and the resource management tasks that can slow development initiatives.

In addition, Lakebase features autoscaling to help users control the cost of building agents and other AI applications, and unified governance through Databricks' Unity Catalog, among other capabilities. 

Given that Lakebase better integrates PostgreSQL workloads with the broader Databricks platform, it is a significant addition for the vendor's customers, according to Devin Pratt, an analyst at IDC. 

"The opportunity is to reduce friction between operational and analytical data so real-time applications and AI agents can work from governed data that stays current, with less ETL and duplication," he said. 

William McKnight, president of McKnight Consulting, similarly noted that Lakebase's value lies in its integration with other Databricks capabilities, reducing the need for data egress pipelines between the database and other tools. 

"This architectural shift minimizes fragile pipelines by co-locating transactional workloads with heavy analytics under a single governance model," he said. "It effectively removes the 'architectural tax' that has historically separated live apps from data lakes." 

Prowess of PostgreSQL

Based in San Francisco and one of the pioneers of the data lakehouse architecture for storing data, Databricks, like many data management vendors, has added AI development capabilities over the past few years in response to rising customer interest in building AI tools that call on an enterprise's proprietary data to understand its unique operations. 


PostgreSQL is now the most popular database among developers, according to Stack Overflow's 2024 Developer Survey.

Versatility -- the ability to handle geospatial, time series, JSON and vector workloads -- and flexibility are two of the main reasons PostgreSQL databases are now more popular than fellow open source database MySQL and databases provided by vendors such as Microsoft, MongoDB and Redis.

With PostgreSQL so popular, and its adaptability enabling users to run workloads that aid AI development, hyperscale cloud vendors AWS, Google, IBM, Microsoft and Oracle all offer PostgreSQL databases that can be used with their AI development tools. Now, more specialized data management vendors are doing the same. 

Three weeks after Databricks acquired Neon, rival Snowflake purchased Crunchy Data to add a PostgreSQL database. Then in October 2025, Redpanda acquired Oxla to likewise add a PostgreSQL database. 

"PostgreSQL has evolved into the great consolidator of the modern data stack by transforming from a traditional relational database into a unified, multi-model engine capable of powering the agentic AI era," McKnight said. "By natively integrating vector search with structured business data, it eliminates the need for fragmented point solutions, reducing development complexity." 
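McKnight's point about pairing vector search with structured business data can be illustrated with a toy sketch in plain Python: filter rows on a relational attribute, then rank the survivors by cosine similarity to a query embedding, which is roughly what a pgvector query combining a `WHERE` clause with an `ORDER BY embedding <=> :query` does inside PostgreSQL. The rows, vectors and function names below are invented for illustration and are not Databricks or pgvector code.

```python
import math

# Toy "table": each row pairs structured fields with an embedding vector.
# All rows and vectors are invented for illustration only.
products = [
    {"name": "trail shoe", "category": "footwear", "embedding": [0.9, 0.1, 0.0]},
    {"name": "road shoe",  "category": "footwear", "embedding": [0.8, 0.3, 0.1]},
    {"name": "rain coat",  "category": "apparel",  "embedding": [0.1, 0.9, 0.2]},
]

def cosine(a, b):
    # Cosine similarity between two vectors of equal length.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def search(rows, query_vec, category, k=2):
    # Relational filter (WHERE category = ...) plus vector-similarity
    # ranking (ORDER BY ... LIMIT k), done in one pass over the data.
    filtered = [r for r in rows if r["category"] == category]
    return sorted(filtered, key=lambda r: -cosine(r["embedding"], query_vec))[:k]

hits = search(products, [1.0, 0.0, 0.0], "footwear")
print([h["name"] for h in hits])  # → ['trail shoe', 'road shoe']
```

In a real PostgreSQL deployment the similarity ranking runs inside the database, typically accelerated by a vector index, rather than in application code; the sketch only shows why keeping vectors next to structured columns avoids a separate point solution.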

In addition, pricing is a factor in PostgreSQL's growing popularity, McKnight continued, noting that PostgreSQL databases often cost less than databases from hyperscale cloud vendors.  

"As enterprises pivot toward Sovereign AI to maintain data gravity and avoid public cloud lock-in, PostgreSQL has become the strategic foundation for organizations that want a secure, high-performance platform to manage the transactions and vectors required for modern AI at scale," he said. 

Although PostgreSQL databases are gaining popularity as more enterprises invest in AI development, Databricks' Lakebase and Snowflake Postgres are differentiated from standalone PostgreSQL databases by their integration with broader data management and AI development platforms, according to Pratt. 

Both reduce the need to move data between systems, which can increase development costs and potentially expose data to breaches, and both enable hybrid transactional and analytical workflows that are relevant for AI and real-time analytics workloads. 

But whether one proves more effective than the other remains to be seen. 

"Both are pushing PostgreSQL closer to analytics and AI, and the real differences will come down to platform integration and day-to-day operational experience," Pratt said. 

In addition to separation of compute and storage, key features of Lakebase include the following: 

  • Serverless autoscaling that automatically adjusts compute resources to match workload demand, including scaling to zero when no workloads are running to eliminate wasted spending. 

  • Unified governance through the Databricks Unity Catalog, enabling users to manage and secure data across their entire data estate. 

  • Instant database branching so users can quickly create isolated clones of production data to conduct risk-free testing and development work. 

  • Point-in-time recovery, a feature that protects against accidental deletions or bugs. 

  • Synced tables that automatically keep operational data and historical lakehouse context in sync, without users having to build and manage complex pipelines. 

Collectively, the features that make up Lakebase are designed to let users run governed, secure operational data workloads directly on Databricks without having to configure connections between their PostgreSQL database and AI development pipeline or move data between systems, according to a Databricks spokesperson. 

Meanwhile, instant database branching stands out as perhaps Lakebase's most significant feature, according to Pratt. 

"Instant branching improves developer productivity by making it easier to test on production-like data without putting production systems at risk," he said. 
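Branch creation can be near-instant because, in Neon-style storage, a new branch initially shares all of its parent's data pages and copies a page only when the branch writes to it (copy-on-write). The following is a minimal sketch of that idea in plain Python, with invented class and method names; it is not Databricks' or Neon's actual implementation.

```python
class Branch:
    """Copy-on-write view of a page store: reads fall through to the
    parent until a page is modified locally. Illustrative sketch only."""

    def __init__(self, parent=None):
        self.parent = parent
        self.pages = {}  # only locally modified pages live here

    def read(self, page_id):
        if page_id in self.pages:          # branch has its own copy
            return self.pages[page_id]
        if self.parent is not None:        # otherwise read the parent's page
            return self.parent.read(page_id)
        raise KeyError(page_id)

    def write(self, page_id, data):
        self.pages[page_id] = data         # the copy happens only on write

    def branch(self):
        return Branch(parent=self)         # O(1): no data is copied up front

prod = Branch()
prod.write("page-1", "customer rows v1")

dev = prod.branch()                        # instant clone of "production"
dev.write("page-1", "customer rows v2")    # test freely on the branch

print(prod.read("page-1"))  # → customer rows v1 (production untouched)
print(dev.read("page-1"))   # → customer rows v2
print(len(dev.pages))       # → 1 (only the written page was copied)
```

This is why testing on a branch carries no risk to production: the branch's writes land in its own copies, while unmodified data continues to be read straight from the parent.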

McKnight, however, highlighted decoupled compute and storage. 

"This fundamental shift directly addresses the long-standing 'architectural bottleneck' by facilitating serverless autoscaling and limiting resource contention between demanding analytical workloads and live operational applications," he said.  

Looking ahead

With Lakebase now generally available, one of Databricks' focal points is to make it easy to operate a large number of databases at scale, according to the spokesperson. 

Ease of use is a wise focus for Databricks, according to McKnight. 

Databricks has historically appealed to technical experts while rival Snowflake has targeted business users. To broaden its appeal, McKnight advised Databricks to improve Databricks Serverless, a fully managed service that removes infrastructure management tasks, and its Databricks One user interface. 

"By evolving its Serverless and Databricks One initiatives into a true zero-administration environment, Databricks can appeal to business analysts who want the architectural efficiency of a lakehouse without the traditional engineering overhead," he said. 

An additional area of focus could be cost control, McKnight continued. 

"To neutralize Snowflake, Databricks must … prove that it can provide a lower total cost of ownership while bridging the AI return on investment gap with production-ready, operational templates," he said. 

Pratt, meanwhile, suggested that Databricks expand efforts to converge operational and analytical workloads to fuel AI initiatives, including providing practical guidance and reference architectures that help customers move from pilots to enterprise-wide production. 

"The next chapter is adoption, helping customers turn convergence into production applications that deliver real-time decisions," he said. 

Eric Avidon is a senior news writer for Informa TechTarget and a journalist with more than three decades of experience. He covers analytics and data management. 
