SALT LAKE CITY -- Red Hat will buy a top contributor to a key LLMOps utility used by OpenShift AI that supports self-hosted large language models on standard hardware.

The deal, for an undisclosed sum, was publicized this week during KubeCon + CloudNativeCon North America as users of the company's internal developer platform gathered at a co-located OpenShift Commons event. Neural Magic, based in Somerville, Mass., specializes in advanced techniques for optimizing the LLMs that underpin generative AI applications to perform well in a variety of IT infrastructure environments. The company was founded in 2018 by an MIT professor and researcher with the goal of decoupling generative AI applications from expensive and often hard-to-find GPU hardware.

Neural Magic's focus on widening infrastructure support for LLMs is in keeping with both Red Hat's hybrid cloud strategy for its developer platforms and its commitment earlier this year to support open source AI models, according to company officials.

"We see the future of AI being accelerated through open [source]," said Red Hat CTO Chris Wright during a press conference Tuesday. "Our goal is to build this scalable, trainable AI infrastructure that allows our customers to deploy their workloads, train their workloads and deploy inferencing anywhere that makes sense for their business."

Neural Magic employs two of the top 10 contributors to the vLLM project, described on its GitHub page as "a high-throughput and memory efficient inference and serving engine for LLMs." The vLLM library has shipped as part of Red Hat's RHEL AI and OpenShift AI products since midyear.

Within OpenShift AI, vLLM functions similarly to a traditional web application runtime server, but optimized to run an LLM, according to Derek Carr, senior distinguished engineer at Red Hat, in an interview with TechTarget Editorial at OpenShift Commons.

"In a traditional Java app, you have a JAR [Java archive] or WAR [web application archive] file and you give it to something like [Apache] Tomcat or JBoss to run it," Carr said. "Instead of giving it a JAR, you give it an LLM."

The acquisition means Red Hat will bring in engineers with expertise in LLM training, serving and inferencing as enterprises struggle with return on investment and data privacy concerns with generative AI. These issues have some companies exploring the idea of hosting generative AI workloads themselves rather than paying a public cloud provider to take in sensitive model training data, according to industry analysts.

"Having smaller models closer to the user and being able to manage model sprawl are tough challenges that this acquisition has helped [for] Red Hat," said Rob Strechay, an analyst at TheCube Research. "OpenShift AI is doing extremely well in enterprise organizations … still trying to get to ROI. This addition will take models into the corners of an enterprise's deployments, such as on the manufacturing floor and telco colocation [facilities]."

Developer platforms pivot into LLMOps

OpenShift AI users who presented at Commons expressed interest in vLLM and other LLMOps features, but it's still early yet, even for companies as experienced in AI and machine learning as Mastercard.

On Tuesday, reps for the credit card issuer talked about the recently launched version 2.0 of the AI-Workbench platform they maintain for machine learning operations services, which is now based on OpenShift AI. Version 2.0 offers a self-service "playground" that automates deployments of Apache Spark behind the scenes.

LLMOps is still on the roadmap, said Ravishankar Rao, principal software engineer at Mastercard, in an interview with TechTarget Editorial following the presentation.

"Soon we'll have LLMOps as a service based on Nvidia NIMs [inference microservices], and we want to bring in use cases to run against company-specific data," Rao said. "We're working with OpenShift AI to evaluate vLLM."

High-performance computing (HPC) engineers from New York University said their platform is still undergoing LLMOps "growing pains," partly because of overlap with internally developed Kubernetes and cloud platforms that must be migrated into OpenShift AI.

"We're still in an early pilot phase for a few isolated things with OpenShift AI," said Carl Evans, senior HPC specialist at NYU, during a Q&A session at Commons. "But there's stuff we want to bring in house [from public cloud] … to protect student data."