Vast Data services speed up AI workloads, add intelligence

New services and features from Vast aim to tackle AI workload bottlenecks by embedding more intelligence at the data storage layer.

Vast Data's updates to its AI OS platform this week reflect a broader enterprise infrastructure trend toward tightly integrated software stacks for AI workloads, and they could move the company into closer competition with the likes of Snowflake and Databricks.

Vast, which has evolved from a high-performance storage company into the provider of a software-defined infrastructure platform for AI workloads called the AI Operating System, made a series of updates aimed at improving enterprise AI workload performance by keeping data pipelines close to compute resources and expanding its data management features.

Among the new releases are: 

  • VAST CNode-X, a data infrastructure component meant to accelerate enterprise AI data and storage services. 
  • PolicyEngine, a service that governs agentic AI activity.  
  • TuningEngine, a service that manages and automates AI model tuning. 
  • Polaris, a data control plane designed to provision, operate and orchestrate multi-cloud clusters.  
  • A partnership with CrowdStrike to strengthen security in Vast's AI OS.  
  • An expansion to Cosmos Community, its global partner program.  

"People are now realizing that data in particular is critical to AI strategies, and it relates even more so to inference pipelines," said Jeff Denworth, co-founder at Vast. "People are realizing that there's different modalities of data -- enterprise data [and] regulated data now need to come into the fold as people look at building agency into their environments." 

AI workloads require performance and scalability, especially when it comes to data storage. However, traditional storage infrastructure strategies have their limits. Vendors such as Vast are updating their products to incorporate data processing units (DPUs) and software that moves data processing away from slower CPUs and closer to where data is stored.  

With these updates, Vast is positioning itself to better accelerate AI workloads and minimize these bottlenecks, especially during AI inference, a part of generative AI workflows in which a trained model applies what it has learned to specific data.  

Vast and Nvidia collaborate to create CNode-X 

Vast expanded its collaboration with Nvidia to deliver CNode-X, a 2U, two-GPU server node that runs embedded Nvidia acceleration libraries and inference microservices alongside Vast's AI OS as a single unit. It eliminates potential bottlenecks between previously separate systems, according to Vast.  

"This is the first time we've run accelerated services natively within our system … right in a Vast cluster," Denworth said. CNode-X can be used to speed up vector search and inference within Vast AI OS workloads.  

Vast plans to bring CNode-X servers to market through its OEM partners, such as Cisco and Supermicro.

In general, the market is moving toward quickening AI workloads by keeping data pipelines close to compute resources, said Rob Strechay, an analyst at TheCube Research and Smuget Consulting.

In January, Vast also collaborated with Nvidia to run Vast AI OS on Nvidia BlueField-4 DPUs, reflecting another emerging trend to quicken AI inference and data access workloads. DPUs help move data-intensive structures, such as the key-value (KV) cache that LLMs use as a memory mechanism during inference, between storage and GPUs.
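The data-movement pattern this DPU offload targets can be illustrated with a minimal sketch: hot KV-cache blocks stay in fast (GPU) memory while cold blocks spill to a slower storage tier and are fetched back on demand. This is purely illustrative of the tiering concept, not Vast's or Nvidia's actual implementation; all class and method names here are assumptions.

```python
from collections import OrderedDict

class TieredKVCache:
    """Illustrative two-tier KV cache: a small fast tier plus a large slow tier."""

    def __init__(self, gpu_capacity: int):
        self.gpu = OrderedDict()   # fast tier (GPU memory), limited capacity
        self.storage = {}          # slow tier (storage), effectively unbounded
        self.capacity = gpu_capacity

    def put(self, block_id, kv_block):
        self.gpu[block_id] = kv_block
        self.gpu.move_to_end(block_id)           # mark as most recently used
        if len(self.gpu) > self.capacity:
            # Evict the least-recently-used block to the storage tier.
            cold_id, cold_block = self.gpu.popitem(last=False)
            self.storage[cold_id] = cold_block

    def get(self, block_id):
        if block_id in self.gpu:
            self.gpu.move_to_end(block_id)
            return self.gpu[block_id]
        # Miss in fast memory: fetch from storage. This is the slow path
        # that DPU-accelerated data movement aims to speed up.
        block = self.storage.pop(block_id)
        self.put(block_id, block)
        return block
```

The slow path in `get` is where moving data processing off the CPU matters: every miss crosses the storage-to-GPU boundary.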

However, it's unclear how significant the cost savings and performance boost from this system consolidation will be, Strechay said.  

"You want [the AI workload] as close to the GPU as possible. ... I think the jury's still out on whether there is a cost savings and a performance enhancement," he said. "We haven't seen that [BlueField-4] with real workloads on it yet."  

PolicyEngine and TuningEngine  

PolicyEngine and TuningEngine are services that Vast will add to its AI OS later this year to govern agentic AI activity and to manage and automate AI model tuning, respectively.

"Think about this as a next-generation approach to building trust within a platform that arbitrates every single activity in a system, not just access to data, but the memories that agents retain, the ways in which agents communicate with other agents, as well as other tools," said Denworth. 

Denworth described PolicyEngine as a service that examines every event in a system to determine whether an action is allowable. It also determines which types of data can be presented to agents, and in what ways.

PolicyEngine is meant to sit between every element in the Vast AI OS system, including between agents, between agents and memory, and between agents and Model Context Protocol tools, as well as around other operations.
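The arbitration model described above, in which every event is checked before it proceeds, can be sketched as a default-deny rule evaluator. Vast has not published PolicyEngine's API, so the event fields and rule shapes below are hypothetical assumptions meant only to show the pattern.

```python
from dataclasses import dataclass

@dataclass
class Event:
    """Hypothetical event record: who is acting, what they do, on what."""
    actor: str       # e.g. an agent identifier
    action: str      # e.g. "call_tool", "read_memory"
    resource: str    # e.g. "mcp:search", "customer_records"

# Illustrative rules; each allows or denies an (action, resource) pair.
RULES = [
    {"action": "call_tool", "resource": "mcp:search", "effect": "allow"},
    {"action": "read_memory", "resource": "customer_records", "effect": "deny"},
]

def evaluate(event: Event) -> bool:
    """Return True if the event is allowable; unmatched events are denied."""
    for rule in RULES:
        if rule["action"] == event.action and rule["resource"] == event.resource:
            return rule["effect"] == "allow"
    return False  # default-deny: anything without an explicit allow is blocked
```

Default-deny is the conservative choice for agentic systems: a new tool or memory access an administrator has not explicitly permitted is blocked rather than silently allowed.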

TuningEngine is an AI model fine-tuning framework for Vast AI OS. It operates in conjunction with PolicyEngine to ensure that any changes to a model are allowable according to user-set policies. It works by collecting data from agents deployed in a framework through an extract, transform, load (ETL) process that creates artifact tables. The artifact tables are then fed into a set of tuners, and the process can be automated using reinforcement learning, which evaluates and then deploys the fine-tuned models.

Strechay predicted that this additional intelligence within Vast's data platform will eventually make it more competitive against data management vendors such as Databricks and Snowflake. 

"I think Vast looks at it as going up the stack," said Strechay. "When you look at a data platform ... there's different layers. There's the storage layer ... a storage services layer. ... As you start to move up, then you get into metadata and [governance, risk and compliance], which is where PolicyEngine is starting to play -- and what it's trying to do is keep moving up."  

Polaris multi-cloud control plane  

Polaris, also among the product updates rolled out this week, is a multi-cloud control plane that Vast Data designed to provision, operate and orchestrate its AI infrastructure using Kubernetes clusters deployed in public cloud platforms including AWS, Azure, Google Cloud and Oracle Cloud Infrastructure. Polaris will also "be extended more broadly for people that have fleet-level scale operational challenges," Denworth said, such as neocloud and on-premises AI training and inference environments. 

Polaris deploys a Kubernetes operator and the Polaris agent locally in each environment or location where data or AI workloads live. It provides management, governance and orchestration of data across environments, as well as security and isolation via Vast Gateway.  

Vast Gateway "is off by default, but allows organizations to manage their global fleet from one portal without exposing their infrastructure or requiring a full stack to be deployed everywhere," said Jonsi Stefansson, general manager of cloud at Vast.  

Polaris has several competitors in distributed cloud data storage infrastructure systems, including Nutanix, which added support for on-premises, disconnected infrastructure in its Nutanix Cloud Platform in December. Public cloud providers also offer distributed and hybrid cloud products such as Microsoft's Azure Stack. NetApp supports multi-cloud infrastructure with its AFX systems and AI workloads with its AI Data Engine, launched in October.  

"IBM had some [updates] in this space around AI as the storage admin, and I think that's where Polaris ultimately goes from a control plane perspective," Strechay said. "I think it also shows that some of Vast's deployments are getting massive -- Polaris is really a very large potential equalizer [with competitors]." 
