Next-generation cloud computing demands are forcing data center architects to consider new IT innovations. Applications such as machine learning, video transcoding and computational storage are growing massively, and they run up against limits set by the maximum available size of main memory. A disruptive new memory-centric architecture could change future storage architectures.
Current Memory Architecture
Today, main memory is under the control of the central processing unit (CPU), as it has been for more than 50 years. As a result, the system architecture is required to conform to the CPU's interfaces. This effectively fixes the ratio of memory to compute in any practical system, which is an impediment to scaling many memory-centric applications. To support data-intensive applications at scale, you have to keep buying more processors just to reach the memory requirements.
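The effect of a fixed memory-to-compute ratio is easy to see with some arithmetic. The following is a minimal sketch; the function name and all capacities are hypothetical and do not describe any real product. It counts how many CPU sockets a workload forces you to buy when memory can only be attached through a CPU.

```python
import math


def sockets_for_workload(required_memory_gib, required_cores,
                         max_memory_per_socket_gib, cores_per_socket):
    """Return (sockets_to_buy, sockets_for_memory, sockets_for_compute).

    All parameters are illustrative assumptions: memory hangs off a CPU
    socket, and each socket has a hard cap on attachable memory.
    """
    by_memory = math.ceil(required_memory_gib / max_memory_per_socket_gib)
    by_compute = math.ceil(required_cores / cores_per_socket)
    # The larger of the two requirements dictates the purchase.
    return max(by_memory, by_compute), by_memory, by_compute


# Hypothetical workload: 24 TiB of memory but only 64 cores of compute,
# on sockets that top out at 2 TiB and 32 cores each.
total, by_memory, by_compute = sockets_for_workload(
    required_memory_gib=24 * 1024,
    required_cores=64,
    max_memory_per_socket_gib=2 * 1024,
    cores_per_socket=32,
)
print(f"buy {total} sockets: {by_memory} for memory, {by_compute} for compute")
```

With these assumed numbers, memory rather than compute sets the socket count, which is the situation the paragraph above describes.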
There have been various attempts to circumvent this limitation, but they all have drawbacks. For example, Remote Direct Memory Access (RDMA) architectures require software to manage moving data from non-volatile storage into and out of main memory, as well as more software to synchronize the distant copies; in other words, to provide coherence to the programmer.
Meanwhile, several promising new technologies are on the horizon, enabling architects to rethink memory-centric computing. These include:
- The emergence of higher-density, byte-addressable non-volatile memories. These are quickly becoming cost-competitive with dynamic random access memory (DRAM) and could become a new type of main memory.
- Advances in the P4 programming language and its use in programmable Ethernet switches. This new level of flexibility allows architectures to use low-cost Ethernet hardware for a high-performance memory fabric.
- The acceptance of open hardware, which is unleashing new processor microarchitectures and open interconnect protocols such as TileLink. Many of the buses and messaging required for multiple CPUs to share cache and main memory are now open.
A new approach that builds on these three technologies is making memory-centric fabric architecture possible.
Enter Cache-Coherent Memory Fabric for Shared Main Memory
As an active participant in the RISC-V ecosystem, Western Digital introduced OmniXtend™ in 2019. OmniXtend provides cache-coherent memory over an Ethernet fabric. This memory-centric system architecture is the first cache-coherent memory technology to provide open-standard interfaces for memory access and data sharing across a wide variety of processors, FPGAs, GPUs, machine learning accelerators and other components. It is an open solution for efficiently attaching persistent memory to processors, and it offers potential support for future advanced fabrics that connect compute, storage, memory and I/O components.
By using OmniXtend, system designers can take advantage of the many benefits of memory-centric architecture. Heterogeneous systems can all reside in the same memory domain and share memory in a coherent way. This new, open approach enables components to share one memory pool through an Ethernet switch.
Figure 1: Components share one memory pool through an Ethernet switch.
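To make "sharing memory in a coherent way" concrete, here is a toy model of a shared pool that enforces the usual single-writer, multiple-reader rule. It is only a sketch under stated assumptions: the `SharedPool` class, its methods, and the node names are invented for illustration, and the logic is a simplified directory scheme, not OmniXtend's actual protocol.

```python
class SharedPool:
    """Toy directory: a line has many read copies or one writable copy."""

    def __init__(self):
        self.memory = {}    # address -> value held in the shared pool
        self.sharers = {}   # address -> set of nodes holding a read copy
        self.owner = {}     # address -> node holding the writable copy
        self.caches = {}    # node -> {address: cached value}

    def attach(self, node):
        self.caches[node] = {}

    def _write_back(self, address):
        # Pull the line back from its current owner, if there is one.
        owner = self.owner.pop(address, None)
        if owner is not None:
            self.memory[address] = self.caches[owner].pop(address)

    def read(self, node, address):
        cache = self.caches[node]
        if address in cache:
            return cache[address]            # local hit
        self._write_back(address)            # force the owner to give it up
        value = self.memory.get(address, 0)  # unwritten lines read as 0
        cache[address] = value
        self.sharers.setdefault(address, set()).add(node)
        return value

    def write(self, node, address, value):
        if self.owner.get(address) != node:
            self._write_back(address)
            # Invalidate every read copy before granting ownership.
            for sharer in self.sharers.pop(address, set()):
                self.caches[sharer].pop(address, None)
            self.owner[address] = node
        self.caches[node][address] = value


pool = SharedPool()
pool.attach("cpu")
pool.attach("gpu")
pool.write("cpu", 0x40, 7)
print(pool.read("gpu", 0x40))  # the GPU observes the CPU's write
```

In a real fabric the write-back and invalidation steps would be messages exchanged between caches; here they are plain method calls so that the invariant is easy to follow.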
OmniXtend was motivated by the desire to unshackle main memory from the CPU and to address the urgent need of the RISC-V ecosystem for a common scale-out protocol. Given the new levels of dataplane programmability offered by P4 Ethernet switches, Ethernet is a logical medium for transporting cache coherency messages. OmniXtend uses the programmability of modern Ethernet switches to enable processors' caches to exchange coherence messages directly over an Ethernet fabric. A thorough re-architecting of a compute and storage system can now take full advantage of these new technologies and enable continued scaling into the future.
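As a rough illustration of what carrying a coherence message directly in an Ethernet frame could look like, the sketch below packs a small message behind a standard 14-byte Ethernet header. The field layout, the opcode values, and the helper names are assumptions made for this example and are not the OmniXtend wire format; the opcode names are loosely borrowed from TileLink, and the EtherType is one of the values IEEE reserves for local experimental use.

```python
import struct
from enum import IntEnum

ETHERTYPE_EXPERIMENTAL = 0x88B5  # IEEE 802 local experimental EtherType


class Op(IntEnum):
    """Hypothetical opcodes, named after TileLink-style operations."""
    ACQUIRE = 1
    GRANT = 2
    PROBE = 3
    RELEASE = 4


ETH_HEADER = struct.Struct("!6s6sH")  # destination MAC, source MAC, EtherType
MESSAGE = struct.Struct("!BBHQ")      # opcode, permission, source id, address


def encode(dst_mac, src_mac, op, permission, source_id, address):
    """Build an (unpadded) frame carrying one illustrative coherence message."""
    header = ETH_HEADER.pack(dst_mac, src_mac, ETHERTYPE_EXPERIMENTAL)
    return header + MESSAGE.pack(int(op), permission, source_id, address)


def decode(frame):
    """Parse a frame produced by encode(); reject other EtherTypes."""
    dst_mac, src_mac, ethertype = ETH_HEADER.unpack_from(frame, 0)
    if ethertype != ETHERTYPE_EXPERIMENTAL:
        raise ValueError("not a coherence frame")
    opcode, permission, source_id, address = MESSAGE.unpack_from(
        frame, ETH_HEADER.size)
    return dst_mac, src_mac, Op(opcode), permission, source_id, address


frame = encode(b"\x02\x00\x00\x00\x00\x01", b"\x02\x00\x00\x00\x00\x02",
               Op.ACQUIRE, permission=1, source_id=3, address=0x8000_0040)
print(decode(frame))
```

Because the message sits at a fixed offset behind the Ethernet header, a programmable switch could in principle parse and route on fields such as the address. A real frame would also be padded to the Ethernet minimum size, which this sketch leaves out.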
Extending the Ecosystem
OmniXtend is being further developed in the open-source hardware group called the Common Hardware for Interfaces, Processors and Systems (CHIPS) Alliance. This open organization is developing the technical implementation and the open standard for all to use. The programmability of OmniXtend-capable Ethernet switches allows any desired modifications to coherence domains or protocols to be deployed immediately in the field, without requiring new system software or new application-specific integrated circuits (ASICs). OmniXtend will accelerate innovation in data center architectures, purpose-built compute acceleration and CPU microarchitectures.