Date: 2025-04-09

Google Gemini 2.5 Pro will become the first proprietary frontier large language model available for on-premises deployment later this year, including in air-gapped environments.

Other major frontier large language model (LLM) providers, such as Anthropic and OpenAI, do not support on-premises deployments of their latest models. Microsoft Azure OpenAI Service supports on-premises deployments of cloud APIs to move them closer to user data but does not have an on-premises version.

Until now, enterprises whose security, privacy and cost concerns kept them from accessing cloud-hosted models through an API were limited to open source models such as Meta's Llama and DeepSeek, said Chirag Dekate, an analyst at Gartner.

Google will partner with Nvidia to make Blackwell GPU-based Google Distributed Cloud (GDC) appliances available for on-premises private deployments in the third quarter. A Google press release did not specifically mention the company's latest model, Gemini 2.5 Pro, released last month, as part of the package. But it did specify that the package will support Google's "most capable models," and it mentioned a 1 million-token context window, which matches Gemini 2.5 Pro.

"Many of our enterprise clients are actively using, evaluating and building around Llama 3 and evaluating Llama 4 … and DeepSeek as well. Nothing's wrong with that," Dekate said. "But when you need enterprise-grade safety, security guardrails, and, more importantly, liability protection and so on, if you want to tap into frontier model innovation, and you're building things on-prem, you are kind of out of luck."

The push for privacy in GenAI

Another industry analyst sees this move by Google as an attempt to counter competition from VMware by Broadcom, which has been emphasizing private cloud as a more cost-effective alternative to public clouds, including for AI workloads.

"Private cloud and on-premises adoption across the cloud-native ecosystem is top of mind for both vendors and enterprises right now, arising in part around the Broadcom acquisition of VMware, which has impacted many platform provider go-to-market strategies in my orbit," wrote Devin Dickerson, an analyst at Forrester Research, in an email. "These technologies will [get] broad adoption in public cloud, but the reality is that on-premises and private cloud environments remain highly relevant as deployment targets, even for modern applications."

Docker Inc. is another vendor pushing into on-premises LLM deployment, adding support for Google's free and open source Gemma model and Llama to Docker Desktop 4.40 last week. Docker Desktop Model Runner brings with it support for these LLMs as Open Container Initiative artifacts that can be stored in containers on developer machines. Docker will also partner with Google, Continue, Dagger, Qualcomm, Hugging Face, Spring AI and VMware Tanzu AI Solutions to extend local integrations with more AI models and frameworks.

The cost and context problem

In addition to security and privacy concerns, the costs of relying on cloud APIs and local model performance are mounting concerns for enterprise developers as they experiment with LLMs, said Nikhil Kaul, vice president of product marketing at Docker, and previously head of marketing for Google's cloud native app development team.

"There's no delay in data transmission to and from the cloud server when you're trying to develop locally, on your own existing hardware," he said. "Typically, if you end up using cloud services, you end up paying for those cloud services."

Dekate said he doesn't expect most enterprise GenAI deployments to run on-premises long-term, but some data and workloads will never be migrated to cloud.

"Most enterprises are using GenAI to accelerate migration for data that can be migrated to the cloud [and] trying to spend less on legacy data center infrastructures," he said. "But having done that, what many enterprises are realizing is some of the data cannot be moved to the cloud, even if they want to … [But they] need to be able to tap into a common set of innovative models."

Extending beyond public cloud will also help Google Gemini users tap into a more holistic set of data, advancing enterprise GenAI development, Dickerson said.

"The upshot for developers is in the ability to connect [AI] to existing enterprise data and systems -- solving this context problem is far more important for enterprise results with AI than which models they choose," he said. "There's a lot you can do with general-purpose tooling, but the real value for enterprise customers comes when the tools become more context-aware within the software development lifecycle."