VMware and Nvidia are teaming up to bring virtual GPU services to VMware Cloud on Amazon Web Services, drawing on the capabilities of Nvidia's T4 GPUs and Virtual Compute Server software. The combination lets organizations run high-performance, GPU-accelerated applications on a virtual infrastructure.
With Virtual Compute Server (vComputeServer), organizations can deliver vGPUs to virtual servers running on VMware vSphere and other hypervisors. These Nvidia technologies let IT teams provide VMs that support GPU-accelerated workloads such as AI, machine learning and advanced analytics, all while reaping the benefits of server virtualization.
A look at virtualizing GPU-accelerated workloads
Organizations have traditionally run GPU-accelerated workloads on bare metal and limited virtual servers to CPU-exclusive applications, leaving them little choice but to incur the hefty costs of dedicated hardware. To make matters worse, the IT teams running these applications often had to deploy and manage that hardware themselves, adding further cost and overhead.
VDI is the one exception to the CPU-exclusive rule: it has incorporated vGPU technologies for quite a while, even as server virtualization lagged behind. That has changed with the release of vComputeServer, which sets a foundation for VMware GPU virtualization. Organizations can now run GPU-accelerated workloads on their virtual servers the same way they run CPU-exclusive workloads, bringing the benefits of server virtualization to compute-intensive applications.
One benefit is increased GPU utilization, which leads to cost savings and resource flexibility. Organizations can also launch compute-intensive applications faster and with less administrative overhead. IT teams can even extend the security benefits of server virtualization to GPU-accelerated workloads by taking advantage of isolated VMs.
Applications hosted on virtual servers with vGPUs run substantially faster than CPU-exclusive applications and have similar performance levels to workloads on bare-metal servers. This is especially important for compute-intensive workloads such as artificial intelligence, machine learning, deep learning, predictive analytics and high-performance computing. Vendors designed GPUs to accelerate critical compute operations and accommodate these workloads.
Inside Nvidia vGPU technology
Nvidia vGPU technology includes physical GPUs and the Nvidia vGPU software, which enables application acceleration on both the virtual server and VDI. Organizations install the software at the virtualization layer along with the hypervisor, then enable VMs to share underlying GPUs.
An Nvidia driver is also installed in each VM so that work can be offloaded from the CPUs to the GPUs. As a result, applications running within the VMs benefit from the physical GPUs installed on the server.
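Once the guest driver is installed, the vGPU appears inside the VM like any other Nvidia device, so standard tooling such as nvidia-smi can confirm it is visible. The sketch below parses CSV-formatted query output; the sample output and the vGPU profile name in it are illustrative, not captured from a real deployment.

```python
# Sketch: confirm the Nvidia guest driver sees a (v)GPU from inside a VM
# by parsing `nvidia-smi --query-gpu=... --format=csv` output.
# The sample text below is illustrative, not real captured output.
import csv
import io

def parse_gpu_query(csv_text):
    """Parse nvidia-smi CSV query output into a list of dicts."""
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    return [dict(row) for row in reader]

# Hypothetical output for a T4-backed C-series vGPU profile.
sample = """name, driver_version, memory.total [MiB]
GRID T4-16C, 430.46, 16384 MiB
"""

gpus = parse_gpu_query(sample)
print(gpus[0]["name"])  # → GRID T4-16C
```

In practice the CSV text would come from running nvidia-smi inside the guest; an empty result would indicate the driver or vGPU profile is not set up correctly.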
Nvidia offers four vGPU software editions: GRID Virtual PC, GRID Virtual Applications, Quadro vDWS and vComputeServer. Only vComputeServer is specific to server virtualization.
The vComputeServer software virtualizes the physical GPUs and makes them available to compute-intensive applications running on virtual servers. Like the other vGPU software editions, vComputeServer is a licensed product available only for Nvidia GPUs.
Unlike the other editions, the vComputeServer license is not tied to a user with a display. Instead, Nvidia licenses the product on a per-GPU basis as a one-year subscription that includes Nvidia enterprise support. This approach is especially beneficial in setups where multiple VMs access a single GPU.
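The advantage of per-GPU licensing is easy to see with some back-of-the-envelope arithmetic. The prices in this sketch are hypothetical placeholders, not Nvidia list prices; the point is only the shape of the comparison.

```python
# Sketch: per-GPU vs. per-user licensing cost when VMs share GPUs.
# All prices below are hypothetical placeholders, not Nvidia pricing.
def yearly_cost(num_gpus, vms_per_gpu, per_gpu_price, per_user_price):
    """Return (per-GPU total, per-user total) annual license cost."""
    per_gpu_total = num_gpus * per_gpu_price
    per_user_total = num_gpus * vms_per_gpu * per_user_price
    return per_gpu_total, per_user_total

gpu_total, user_total = yearly_cost(num_gpus=4, vms_per_gpu=8,
                                    per_gpu_price=450, per_user_price=250)
print(gpu_total, user_total)  # → 1800 8000
```

With eight VMs sharing each GPU, a per-GPU license is billed four times, while a per-user model would be billed 32 times, so the gap widens as GPU sharing increases.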
Nvidia's vGPU technology supports several types of Tesla vGPUs. Each vGPU type offers a specific frame buffer size, number of display heads and maximum resolution. Nvidia groups the vGPU types into series based on their workload classifications.
The vComputeServer software is only compatible with C-series vGPUs, which support compute-intensive workloads. Each physical GPU installed on a server running vComputeServer can run up to eight C-series vGPUs.
The software also supports two vGPU implementation models: GPU sharing and GPU aggregation. The GPU sharing model enables multiple VMs to share a single GPU just as multiple VMs can share the same CPU. This helps optimize resource utilization and reduce costs. The GPU aggregation model enables a single VM to use multiple GPUs and support more demanding workloads.
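The two models answer opposite capacity-planning questions: sharing asks how many vGPUs fit on one GPU, and aggregation asks how many GPUs one VM needs. The sketch below works through both, using the article's stated limit of eight C-series vGPUs per physical GPU; the frame-buffer sizes are illustrative (a T4 carries 16 GB).

```python
# Sketch: capacity planning for the GPU sharing and GPU aggregation
# models. The eight-vGPU cap comes from the article; frame-buffer
# sizes are illustrative examples, not a profile reference.
MAX_C_SERIES_PER_GPU = 8

def vgpus_per_gpu(total_fb_gb, vgpu_fb_gb):
    """GPU sharing: equal-sized C-series vGPUs that fit on one GPU."""
    return min(total_fb_gb // vgpu_fb_gb, MAX_C_SERIES_PER_GPU)

def gpus_for_vm(vm_fb_gb, total_fb_gb):
    """GPU aggregation: whole GPUs needed to back one demanding VM."""
    return -(-vm_fb_gb // total_fb_gb)  # ceiling division

print(vgpus_per_gpu(16, 4))  # → 4 vGPUs share one 16 GB GPU
print(gpus_for_vm(32, 16))   # → one VM aggregates 2 GPUs
```

Note that the frame buffer is statically partitioned per vGPU, so four 4 GB vGPUs fully consume a 16 GB card even if some sit idle.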
The software also incorporates error correction code and page retirement features to prevent data corruption and ensure greater resource reliability, which is especially important when processing massive data sets.
VMware Cloud on AWS to support GPU-accelerated workloads
Organizations have much to gain by virtualizing their compute-intensive workloads, especially if they're already committed to server virtualization and the vSphere ecosystem. Nvidia's vGPU technology makes it possible to effectively use GPU resources and extend GPU capabilities to teams and organizations that couldn't justify the investment in bare-metal solutions.
Once these capabilities are available to VMware Cloud on AWS, customers will have more options for implementing their GPU-accelerated workloads and VMware GPU virtualization. They'll be able to incorporate the workloads into their hybrid infrastructure and seamlessly migrate vSphere-based applications to VMware Cloud on AWS, which runs on Amazon Elastic Compute Cloud bare-metal instances.
Until then, organizations must be content to run vComputeServer on-premises if they use the technology. Of course, IT teams must do a thorough cost analysis to determine whether the Nvidia technologies are worth the investment for their particular circumstances, but the effort could be worthwhile for organizations that realize the full potential of vGPUs.