VMware has partnered with Nvidia to build a generative AI architecture that lets enterprises maintain data privacy while training large language models (LLMs), a major concern among companies deploying AI within business operations.
VMware Private AI, unveiled at the VMware Explore conference in Las Vegas Tuesday, comprises Private AI Foundation with Nvidia and Private AI Reference Architecture for Open Source. VMware and Nvidia built the architecture to run on multiple cloud environments, including public, private and edge.
Generative AI promises to dramatically improve the quality of information available to corporate employees by responding to queries in natural language. However, delivering the most valuable answers requires companies to train generative AI's LLMs on corporate data.
Cloud providers AWS, Microsoft and Google already offer enterprise generative AI services. However, despite assurances of data privacy, many organizations hesitate to use those services out of fear that their proprietary information could end up in competitors' hands.
"Private AI is an architectural approach that allows for business gains from AI with the ability to capture the goals of privacy and compliance without compromising business objectives," said Paul Nashawaty, an analyst at TechTarget's Enterprise Strategy Group.
Initial Private AI users are expected to be VMware customers with the resources to run and maintain open source generative AI software on on-premises or edge servers powered by Nvidia GPUs. Dell, Hewlett Packard Enterprise and Lenovo will release the servers by the end of the year, according to Nvidia.
"The level of CIO and CEO interest ... is so high that I can see IT dollars being shifted to get teams ready [for AI]," VMware CEO Raghu Raghuram said during a media briefing following his opening keynote.
VMware expects public cloud providers offering VMware Cloud infrastructure software today to eventually make Private AI available on their platforms. Executives did not provide a timetable for availability on the major providers including AWS, Microsoft Azure and Google Cloud.
Private AI components
The Private AI Foundation architecture, available in early 2024, is built on VMware Cloud Foundation, an integrated software stack bundling NSX for networking, vSphere for computing and vSAN for storage. The platform includes Nvidia AI Enterprise, a cloud-native AI and data analytics software suite.
The joint VMware-Nvidia architecture will provide the software to customize LLMs and run generative AI applications, such as chatbots, virtual assistants, and content search and summarization. The combined technology will enable enterprises to run AI workloads on up to 16 virtual or physical GPUs in a single VM, according to Nvidia.
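As a rough illustration of how the multi-GPU capability surfaces in vSphere, attaching Nvidia vGPUs to a VM amounts to adding shared PCI devices with a vGPU profile, which appears in the VM's .vmx configuration as entries like the following (a sketch only; the profile names are illustrative, and the exact profiles available depend on the GPU and vGPU software release):

```
pciPassthru0.present = "TRUE"
pciPassthru0.vgpu = "grid_a100-40c"
pciPassthru1.present = "TRUE"
pciPassthru1.vgpu = "grid_a100-40c"
```

Each `pciPassthruN` entry represents one virtual GPU assigned to the VM; scaling the same pattern up is what allows a single VM to address many GPUs for training or inference workloads.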
"We made it possible for VMware to run bare-metal performance [on a virtualized GPU] with all its security and manageability," Nvidia CEO Jensen Huang said during the keynote.
Running and fine-tuning an LLM on premises is less expensive than enterprises expect, said Chris Wolf, VMware's chief research and innovation officer, during a media briefing.
"[For] a well-tuned model, the hardware required for the inference overhead is often smaller than people think," Wolf said. LLM inference refers to the process of generating answers to queries.
VMware ran a generative AI model for 50 to 80 software engineers on a single Nvidia A100 GPU, Wolf said. "These are the things that we're looking to share more with customers because sometimes there's this perception that you need all these GPUs," he said.
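Wolf's point about inference overhead can be illustrated with a back-of-envelope calculation. The model size, precision and overhead factor below are illustrative assumptions, not figures from VMware or Nvidia:

```python
# Rough estimate of GPU memory needed to serve an LLM for inference:
# weights take roughly (parameter count) x (bytes per parameter),
# plus some overhead for activations and the KV cache.

def inference_memory_gb(n_params: float, bytes_per_param: int) -> float:
    """Approximate memory (GB) needed to hold model weights for inference."""
    return n_params * bytes_per_param / 1e9

# A hypothetical 13-billion-parameter coding model served in 16-bit precision:
weights_gb = inference_memory_gb(13e9, 2)   # ~26 GB of weights
total_gb = weights_gb * 1.2                 # assume ~20% runtime overhead

print(f"weights ~= {weights_gb:.0f} GB, serving total ~= {total_gb:.0f} GB")
```

Under these assumptions the whole serving footprint fits within a single 40 GB data-center GPU, which is consistent with the idea that a well-tuned model can serve a small engineering team from one card.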
VMware is aiming its Private AI Reference Architecture at enterprise customers that want to build and feed data to open source software running on VMware Cloud Foundation. Partners developing software that would run on the architecture include Anyscale, a computing platform maker for running AI and Python applications; Domino Data Lab, an analytics and data science platform developer for AI and machine learning in the financial industry; and Hugging Face, a tools developer for machine learning applications.
Antone Gonsalves is networking news director for TechTarget Editorial. He has deep and wide experience in tech journalism. Since the mid-1990s, he has worked for UBM's InformationWeek, TechWeb and Computer Reseller News. He has also written for Ziff Davis' PC Week, IDG's CSOonline and IBTMedia's CruxialCIO, and rounded all of that out by covering startups for Bloomberg News. He started his journalism career at United Press International, working as a reporter and editor in California, Texas, Kansas and Florida.