Databricks unveiled new large language model and GPU optimization capabilities in Model Serving in a move designed to enable customers to improve generative AI outcomes.
Databricks launched Model Serving in March. Model Serving is a service that enables Databricks customers to deploy AI and machine learning (ML) models as REST APIs to a single environment for model management, at which point Databricks takes over the management, including refreshing the model with updated data and fixing any bugs.
Before Model Serving, users often had to manage complex AI and ML infrastructures that required them to use batch files to move data into a cache in a data warehouse. There, users could train a model before moving it to another application where the model could ultimately be consumed for analysis. Refreshing models with updated data and tweaking models to fix problems also required copious amounts of work.
REST APIs, however, let users train and deploy models directly on the Databricks Lakehouse Platform, eliminating the need to manage complex infrastructures made up of multiple tools.
In addition, the Model Serving environment comes with pre-built integrations with Databricks tools -- including MLflow Model Registry for deployment, Unity Catalog for governance and vector search for accuracy -- that help customers manage their AI and ML models.
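To make the idea of a model deployed as a REST API concrete, here is a minimal sketch of how a client application might build a scoring request for a served model. It is an illustration, not Databricks' reference client: the workspace URL, endpoint name, token and feature names are hypothetical placeholders, and the `/serving-endpoints/<name>/invocations` path and `dataframe_records` payload shape are assumptions based on Databricks' public documentation that may differ by release. The request is built but never sent.

```python
import json
import urllib.request


def build_scoring_request(workspace_url, endpoint_name, token, records):
    """Build (but do not send) an HTTP POST for a model served as a REST API."""
    # Assumed URL layout for a Model Serving endpoint.
    url = f"{workspace_url.rstrip('/')}/serving-endpoints/{endpoint_name}/invocations"
    # Assumed payload shape: one JSON object per input row.
    body = json.dumps({"dataframe_records": records}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


request = build_scoring_request(
    "https://example.cloud.databricks.com",  # hypothetical workspace URL
    "churn-model",                           # hypothetical endpoint name
    "YOUR_TOKEN",                            # placeholder, never hard-code real tokens
    [{"tenure_months": 12, "monthly_spend": 49.0}],  # hypothetical features
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it to a real workspace. The point of the sketch is that the consuming application only needs an HTTP call, not its own model infrastructure.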
On Sept. 28, Model Serving was updated to include Optimized LLM Serving, a tool that enables users to deploy privately developed generative AI models on the service as well as traditional AI and machine learning models. In addition, new GPU optimization capabilities in Model Serving aim to provide the requisite power for running and managing large generative AI models.
Both are in public preview.
The initial promise of generative AI is increased efficiency.
Natural language processing (NLP) capabilities that enable users to interact with data without having to write code can help data experts work more quickly. NLP can also enable more business users to work with data by lowering barriers to entry, such as the need to know how to code and to have data literacy expertise.
It can also allow for increased automation of repetitive processes and certain customer interactions.
Integrating with public large language models (LLMs) such as ChatGPT and Google Bard to train generative AI models, however, can be risky for organizations that want to keep their data private. When organizations push data out into those models to build and train generative AI models, they risk their data getting exposed.
Even when they import public LLM data into their own environment rather than push data out to the LLM, they risk data leaks by connecting to the public LLM. Organizations can take security measures to try to import LLM technology without exposing their own data, but those measures are not foolproof.
In addition, generative AI models trained on public data don't always deliver accurate results. LLMs are trained to fill in gaps -- essentially, to make things up -- when they don't have the data to answer a question. Sometimes those made-up answers, called AI hallucinations, seem plausible. And that can lead to serious consequences for organizations basing key decisions on model outcomes.
As a result, many organizations are now developing their own language models by using technology from generative AI vendors, but training the models using their own domain-specific data.
Optimized LLM Serving aims to help Databricks customers easily deploy those privately trained generative AI models as well as optimize their performance.
According to Databricks, users simply have to provide the model and the open source or other structure used in its development, and Optimized LLM Serving will take over its management from there. The intended results include saving customers the time it takes to improve a model's performance and reducing the cost of managing a generative AI model by eliminating manual workloads.
Saving time and effort is significant because it enables customers to target end results, according to Doug Henschen, an analyst at Constellation Research.
"Databricks is simplifying matters for customers seeking to develop and deploy generative AI capabilities by eliminating the complexities of infrastructure selection and deployment and model optimization," he said. "This helps customers focus on the business use case instead of decisions around underlying technology."
Databricks' management of generative AI models, meanwhile, is enabled by GPU optimization.
Initially designed to process images and visual data, GPUs can also be used to speed up computational processes that are too much for traditional CPUs. In the case of Model Serving, GPUs provide the compute power for managing generative AI models as a service, according to Prem Prakash, Databricks' principal product marketing manager of AI and machine learning.
Customers simply have to log their model with MLflow, at which point Databricks will take over the model's management. The vendor's platform will automatically prepare a container with GPU libraries and then deploy that container to serverless GPUs where the model will be managed.
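As a rough sketch of the step after a model has been logged with MLflow, the snippet below assembles the kind of request body a customer might submit to ask for a GPU-backed serving endpoint. It only builds a Python dictionary and sends nothing. The endpoint and model names are hypothetical, and the field names (`served_models`, `workload_type`, `workload_size`, `scale_to_zero_enabled`) and the `GPU_MEDIUM` value are assumptions drawn from Databricks' public serving-endpoint documentation, which may change between releases.

```python
import json


def build_gpu_endpoint_config(endpoint_name, model_name, model_version,
                              workload_type="GPU_MEDIUM"):
    """Return an assumed request body for creating a GPU-backed endpoint."""
    served_model = {
        "model_name": model_name,          # model already logged and registered via MLflow
        "model_version": str(model_version),
        "workload_type": workload_type,    # assumed field that selects GPU compute
        "workload_size": "Small",          # assumed concurrency tier
        "scale_to_zero_enabled": False,    # keep the endpoint warm
    }
    return {"name": endpoint_name, "config": {"served_models": [served_model]}}


config = build_gpu_endpoint_config(
    "support-llm",        # hypothetical endpoint name
    "support_llm_model",  # hypothetical registered model name
    1,
)
print(json.dumps(config, indent=2))
```

The container build and GPU provisioning the article describes happen on the vendor's side. In this sketch, the customer's part is limited to naming the logged model and the compute type.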
"LLMs are much more complex and compute-intensive than a document with words on it," Prakash said. "Trying to run it on a CPU could break [the CPU]. That's where GPUs come in."
Databricks' impetus for adding LLM hosting capabilities to Model Serving, meanwhile, was partially driven by customers looking to ease not only the burdens, but also the expenses of managing language models, Prakash continued.
He noted that by optimizing GPUs and automating management of LLMs, Databricks is able to provide management as a service at a lower cost than what organizations would otherwise pay to manage models on their own.
"Once they build [the models] in their own environment, they don't want to do all the work of managing GPUs," Prakash said. "These models are so big that they can be expensive to manage, so they asked if there was something we could do to make [management] more cost-effective."
Databricks is far from the only data management vendor to prioritize generative AI over the past year. For example, rival Snowflake is building an environment for developers to build generative AI applications and in May acquired Neeva to add generative AI capabilities.
But Databricks was one of the pioneers of the lakehouse architecture, which may be the best fit for generative AI model development, and it has added other features aimed at helping users build and deploy generative AI models. That has enabled the vendor to quickly develop tools such as Model Serving and now its enhancements.
As a result, its generative AI enablement capabilities are among the most advanced to date, according to Henschen.
"Databricks was in a better position than many of its competitors to help customers take advantage of generative AI," he said. "It has seized the moment by quickly adding capabilities to help customers use their data to tweak and tune LLMs and bring generative capabilities into production."
Looking ahead, Databricks' generative AI roadmap will focus on continuing to make model deployment and maintenance simpler, according to Prakash.
"We are going to do more of the monitoring and managing," he said.
In addition, model governance is a priority, Prakash continued. Just as data needs to be governed to ensure only certain people within organizations have access to sensitive information, and that only the right people can move and manipulate data to ensure its quality, AI models need permissions.
Henschen, meanwhile, said Databricks should continue executing on its current plans for enabling generative AI development.
Databricks offers Dolly, an LLM the vendor developed so that customers can develop their own generative AI capabilities. In addition, Databricks in June acquired MosaicML to better enable customers to build their own private language models.
Meanwhile, the vendor is building up an infrastructure including Model Serving that creates an environment for AI and ML model training, deployment and management.
"Databricks simply needs to execute on what it has promised by delivering more LLM options and, of course, more custom model building and tuning capabilities by way of the recent MosaicML acquisition," Henschen said.
Eric Avidon is a senior news writer for TechTarget Editorial and a journalist with more than 25 years of experience. He covers analytics and data management.