Data processing units accelerate infrastructure performance

DPUs typically process network packets to move data around the data center, rather than run application workloads. Get an overview of the technology and the main vendors.

The CPU has literally been the core of computing for several decades, especially since it became possible to integrate all the circuitry onto a single microprocessor chip. Following the adoption of graphics processing units in servers, another new technology -- the data processing unit -- is now finding its way into the data center.

CPUs handle application processing and the business logic contained within it. Graphics processing units (GPUs) accelerate the processing of floating-point calculations for applications such as machine learning, analytics and advanced graphics. But where do data processing units (DPUs) fit?

A DPU offloads the networking and communications data processing tasks typically handled by the CPU, which enables the CPU to focus on application support. But as with many new concepts, multiple vendors have adopted the DPU and put their own twist on the technology.

Still, DPUs aren't as new a concept as one might think, according to Ben Sheen, research director for the networking and communications infrastructure enabling technologies group at IDC. Network chips have been gradually taking on more functions for many years, Sheen said.

"Remember the dot-com era around 2000? Everybody was doing the internet thing and the network processor back then," he said. "Gradually, they become the communication processor because a network processor was kind of like an [application-specific integrated circuit] ASIC and specific for one application: networking. Companies like Marvell [Technology Group] and NXP [Semiconductors], they still make these kinds of communications processors with modern capabilities, such as virtualized network functions."

Vendors tackle DPU design and functionality

The first company to use the term DPU was Fungible Inc., a composable infrastructure startup that designed its own DPU chip in 2016 to form the heart of its architecture and offload the host system's job of handling all I/O traffic.

The DPU's purpose is to "sit between the network fabric and compute/storage elements, and handle data-centric workloads such as data transfer, data reduction, data security, data durability, data filtering and analytics -- all functions that general-purpose CPUs are not very good at," according to Fungible.

The Fungible F1 DPU product brief explains that 52 MIPS64 R6 processor cores are at the heart of the chip.

Other DPUs tell a similar story. What sets this hardware apart from a general-purpose CPU is its specialized design, which combines processor cores with a collection of hardware accelerator blocks for functions such as encryption, compression and decompression, and erasure coding.
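
To make the erasure coding function concrete, here is a minimal software sketch of the simplest form of it: a single XOR parity block that lets a stripe survive the loss of any one block. It is an illustration only, not how any vendor's accelerator works; the function names are hypothetical, and production systems typically use stronger codes that tolerate multiple failures.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks):
    """Return the data blocks followed by one XOR parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover(stripe, lost_index):
    """Rebuild the block at lost_index from the surviving blocks in the stripe."""
    survivors = [block for i, block in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


# Three data blocks of equal length (hypothetical sample data).
data = [b"AAAA", b"BBBB", b"CCCC"]
stripe = encode(data)

# Pretend block 1 was lost, then rebuild it from the other three.
rebuilt = recover(stripe, 1)
```

Every byte written requires this kind of arithmetic across the whole stripe, which is why it is a natural candidate for a dedicated hardware block rather than host CPU cycles.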

Sheen agreed with this assessment.

"The CPU is a general processor, such as the x86 [architecture]. But in order to handle control functions, infrastructure processors [DPUs] put a different kind of processor core on the system, like Arm cores, onto a [system on a chip] SoC, plus lots of the network packet processing engines," he said.

Nvidia has also made lots of noise about DPUs, especially since it acquired networking specialist Mellanox Technologies in 2020. Nvidia's DPU strategy centers around the BlueField chips that Mellanox developed for its SmartNIC adapters, which combine a number of Arm cores with high-performance Ethernet ports and accelerators for functions such as encryption.

Mellanox had already promoted the ability of its SmartNICs to offload tasks from the host server, such as the overlay network processing overhead imposed by virtual networking protocols -- like Virtual Extensible LAN -- as part of a software-defined networking deployment.
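
For a sense of what that overlay processing involves, the sketch below builds and strips the 8-byte Virtual Extensible LAN (VXLAN) header defined in RFC 7348, which carries a 24-bit network identifier in front of every tenant frame. It is a simplified illustration with hypothetical function names: a real implementation also wraps the result in outer Ethernet, IP and UDP headers, which are omitted here.

```python
import struct

VXLAN_UDP_PORT = 4789        # IANA-assigned destination port for VXLAN
VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field is valid


def vxlan_header(vni):
    """Build the 8-byte VXLAN header for a 24-bit network identifier."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags in the top byte, then 24 reserved bits.
    # Word 2: VNI in the top 24 bits, then 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def encapsulate(inner_frame, vni):
    """Prefix an inner Ethernet frame with a VXLAN header."""
    return vxlan_header(vni) + inner_frame


def decapsulate(payload):
    """Split a VXLAN payload into (vni, inner_frame)."""
    flags_word, vni_word = struct.unpack("!II", payload[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag not set")
    return vni_word >> 8, payload[8:]


# Round trip with a placeholder frame and a hypothetical tenant VNI.
packet = encapsulate(b"inner-ethernet-frame", 5000)
vni, frame = decapsulate(packet)
```

Doing this for every packet in software consumes host CPU cycles, which is the overhead a SmartNIC is meant to absorb.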

A field-programmable gate array (FPGA) or even an ASIC might perform these tasks, but a SmartNIC with Arm cores offers greater flexibility in that it can easily be reprogrammed to run a new networking protocol or add features. Mellanox also promoted the offloading of software-defined storage functions, such as compression, deduplication and erasure coding, to its SmartNICs.
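
As a rough illustration of two of those storage functions, the following sketch deduplicates fixed-size chunks by content hash and compresses each unique chunk once. The `DedupStore` class and its chunk size are assumptions made for the example, not part of any vendor's product; real systems differ in chunking strategy, hash handling and on-disk layout.

```python
import hashlib
import zlib


class DedupStore:
    """Toy chunk store: identical chunks are kept once, in compressed form."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}  # SHA-256 hex digest -> zlib-compressed chunk

    def write(self, data):
        """Store data and return the list of chunk digests needed to rebuild it."""
        recipe = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                # Only new content costs compression work and storage space.
                self.chunks[digest] = zlib.compress(chunk)
            recipe.append(digest)
        return recipe

    def read(self, recipe):
        """Rebuild the original data from a list of chunk digests."""
        return b"".join(zlib.decompress(self.chunks[d]) for d in recipe)


# Small chunk size so the repeated content is easy to see.
store = DedupStore(chunk_size=4)
recipe = store.write(b"abcdabcdwxyz")
restored = store.read(recipe)
```

Hashing and compressing every chunk is CPU-intensive at data center throughput, which is the case for moving the work onto an offload processor.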

This capability has caught VMware's attention. At the VMworld 2020 conference, the vendor disclosed it was rearchitecting its hybrid cloud offering under the codename Project Monterey to support offloading of key services to Nvidia DPUs, such as networking, storage and security functions.

VMware envisages its offerings running an instance of its ESXi hypervisor on the SmartNIC/DPU, which will manage the operating environment on the host server. It will also virtualize hardware devices such as FPGA accelerators and make the hardware available to VMs that run on the system.

VMware claims organizations will be able to run the ESXi hypervisor on both a DPU and the host server with the same management framework.

Amazon Web Services (AWS) introduced a similar concept with its Nitro System, whose first offload card was deployed in some servers back in 2013. The Nitro System architecture offloads networking, storage, management, security and monitoring functions to a Nitro Card. This makes it possible for AWS to make the hypervisor layer optional and to offer bare-metal instances.

However, AWS designed the Nitro Card for internal use only and based it on an ASIC, so it is not programmable in the way some other DPUs are.

A company with a slightly different spin on the offload processor is storage startup Nebulon. Its services processing unit (SPU) is based on an Arm SoC, but it appears to the host system as a Serial-Attached SCSI (SAS) storage controller sitting on the PCIe bus.

The processor manages all the drives inside a server and links with SPUs in other servers via dedicated network ports. This enables software-defined storage pools to operate without stealing CPU cycles from the host processor.

A key piece of Nebulon's offering is a cloud-based management layer, and management could prove to be an issue for many organizations if the use of DPUs becomes more widespread.

"With the data plane or DPU in every server in your data centers, management becomes a lot more complicated," according to Nebulon.

A look ahead for processing capabilities

The DPU's purpose is to handle infrastructure tasks, which have become much more complex with the shift toward software-defined infrastructure.

Not surprisingly, cloud and hyperscale operators have been early adopters, but with companies like VMware aiming to add support for DPUs into their offerings, DPUs will start to filter into enterprise data centers in the future.

"The enterprise market is always a few years behind the cloud," Sheen said. "I think SmartNICs will only be part of new deployments for the enterprise, or if they want to upgrade the equipment or migrate the infrastructure to the cloud. You'll be looking at meaningful growth three to five years lag behind the web-scale cloud operators."