March 13, 2024

Vast Data is introducing a new cloud AI architecture that uses its parallel file software on Nvidia hardware and is being deployed by CoreWeave. The architecture is designed to increase performance on GPU clusters while providing multi-tenancy for customers.

Ahead of next week's Nvidia GTC event, Vast Data announced that it is running its parallel file operating system on the latest Nvidia BlueField-3 data processing units (DPUs). The DPUs will offload storage functions and improve security and multi-tenancy for users, leaving the GPUs to process AI workloads.

This use of DPUs focuses on the promise of the technology, according to Steve McDowell, an analyst and founding partner at NAND Research. Vast's software typically runs on a dedicated server. In this design, certain tasks are offloaded to the DPU and the DPU can communicate directly with GPU clusters, so a separate server is not necessary.

"This keeps that machine free to do AI [workloads]," he said.

Utilizing DPUs for AI

The new architecture will be first deployed by CoreWeave, a GPU cloud service provider Vast began partnering with in September 2023. The BlueField-3 DPUs increase efficiency of the cluster by offloading data processing, which means fewer x86 servers are needed for I/O, according to Vast.

Maximum GPU performance is typically gained by giving the user root access to a physical server, according to John Mao, vice president of global business development at Vast Data. This allows everything to be seen on the back end, which isn't ideal for security for either service providers or customers. With the Vast operating system on a DPU, there is a level of insulation for both the customer and service provider, as they still have root access but only through the DPU.

Large GPU clusters from cloud providers including CoreWeave are largely shared between multiple customers, McDowell said. Even internal GPU clusters are shared among different teams, making multi-tenancy a priority.

"Anything that makes [multi-tenancy] simpler is better for service providers," he said.

The architecture is beneficial as it physically isolates the software stack as well, McDowell said. This allows the GPU compute to get to a customer without exposing what the customer is running.