To DRAM or not to DRAM? That is the question.
DRAM-less SSDs use some of the host server's DRAM to do everything that an SSD's internal DRAM usually does. Hyperscalers find both a cost and power advantage over standard SSDs.
Certain new data center SSDs have no internal DRAM. It's important to understand how these drives work, because DRAM-less SSDs can benefit some users more than others.
What is a DRAM-less SSD?
Most SSDs include DRAM chips -- sometimes a lot of them. SSDs are not simply a controller and a bunch of NAND flash. The DRAM helps the SSD to manage the intricacies of NAND flash writes. DRAM also helps SSDs communicate with processors through a protocol that was designed around HDDs, which are altogether different from NAND flash.
The SSD's internal DRAM stores metadata, buffers write data, coalesces short writes into longer ones and buffers data that moves around internally to the SSD for garbage collection.
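To make the coalescing step concrete, here is a minimal Python sketch of how a buffer might merge short writes to neighboring logical block addresses into longer ones before they reach flash. The function name and the (start block, block count) representation are assumptions for illustration; real controller firmware tracks far more state than this.

```python
def coalesce_writes(writes):
    """Merge short writes that touch adjacent or overlapping block ranges.

    `writes` is a list of (start_lba, block_count) tuples. This layout is a
    hypothetical simplification, not any vendor's actual data structure.
    """
    merged = []
    for start, count in sorted(writes):
        end = start + count
        if merged and start <= merged[-1][1]:
            # Adjacent to or overlapping the previous range: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            # Gap before this write: start a new range.
            merged.append([start, end])
    return [(s, e - s) for s, e in merged]


buffered = [(0, 4), (4, 4), (8, 8), (64, 4), (100, 2), (102, 2)]
print(coalesce_writes(buffered))  # [(0, 16), (64, 4), (100, 4)]
```

In this example six short writes collapse into three longer ones. Whether the buffer lives in the SSD's own DRAM or in host memory, the logic is the same.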
These differences between NAND flash and HDDs also require the SSD to occasionally perform some protracted background housekeeping operations. The host is unaware of these operations, and the SSD doesn't know when to expect higher or lower traffic from the host. As a result, there are times when the SSD and the host get in each other's way and degrade overall system performance. The host and SSD are rarely in sync with each other.
The SSD's performance, cost and power consumption all increase with a larger internal DRAM.
Hyperscale data center users, originally led by Baidu, experimented with SSDs that had most of the control functions stripped out. These SSDs replaced the internal DRAM with a portion of the host computer's DRAM, which is now called the Host Memory Buffer (HMB).
Baidu's approach gave the host greater control over the SSD's timing and operation. Users could get the best performance by tuning the application and system software to the SSD's internal architecture. It's reasonable for hyperscale data centers because they create and control nearly all their own software.
Baidu wasn't the first to use the host's DRAM instead of DRAM within the SSD. Fusion-io, the originator of the PCIe-based SSD, launched this business with DRAM-less SSDs more than a decade ago. Many found this implementation unattractive at the time because it used the server's DRAM as a substitute, reducing the amount available for other uses.
This configuration also meant that the server couldn't boot from a Fusion-io drive, since the SSD's operation depended on the server being booted beforehand. But today's DRAM-less SSDs have overcome the bootstrap issue, and attitudes about server DRAM use have changed since then.
How do DRAM-less SSDs compare to DRAM SSDs?
A DRAM-less SSD has a lower bill of materials (BOM) cost to manufacture. The SSD itself will consume less power than an SSD with an internal DRAM.
The function of the internal DRAM may have moved from the SSD to the server's HMB, but this function still requires the same number of DRAM bytes. As a result, an increase in server power consumption will likely match any decrease in the SSD's DRAM power consumption.
The BOM point is valid, though. A conventional enterprise SSD might use $20-100 worth of DRAM. If the DRAM-less SSD can be produced without that $20-100, then customers should expect some of those savings to flow through to them.
As Baidu found, an SSD that integrates more tightly with application software can also perform better than one that receives commands randomly while trying to manage its internals. However, the software must be tuned to take advantage of this closer coupling.
That closer coupling provides solid advantages to users who have control over their software. The host can use its understanding of the SSD's internal architecture to synchronize I/O requests to the current status of the SSD's internal NAND flash chips. If one part of the SSD's flash is busy, the application can redirect to a task that uses a different slice of the SSD's flash.
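A simplified Python sketch shows the idea of steering work away from busy flash. It assumes the host can see which flash dies are occupied; the `pick_next_request` function, the request tuples and the die numbering are all hypothetical and stand in for whatever status a given drive actually exposes.

```python
def pick_next_request(pending, busy_dies):
    """Return the first queued request whose target die is idle, else None.

    `pending` is a list of (request_id, die) tuples in arrival order and
    `busy_dies` is a set of die numbers. Both are illustrative assumptions.
    """
    for request in pending:
        _, die = request
        if die not in busy_dies:
            return request
    return None


pending = [("read-A", 0), ("read-B", 2), ("write-C", 1)]
busy = {0}  # die 0 is occupied (for example, mid-erase) in this example
print(pick_next_request(pending, busy))  # ('read-B', 2)
```

Here the host skips the request aimed at the busy die and issues the next one that can proceed immediately, rather than queuing behind a long-running operation.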
Best of all, users can put the SSD's internal housekeeping -- particularly the timing of the garbage collection -- under the host's command instead of leaving the SSD to initiate it based on its own guess at the best timing.
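One way a host might make that timing decision is sketched below in Python. The policy, the function name and every threshold are assumptions chosen for illustration, not values from any vendor or from Baidu's design: collect during quiet periods, and only force collection under load when free blocks run critically low.

```python
def should_run_gc(queue_depth, free_block_ratio,
                  idle_depth=2, low_water=0.10, target=0.25):
    """Decide whether the host should trigger garbage collection now.

    queue_depth      -- number of I/O requests currently waiting
    free_block_ratio -- fraction of flash blocks that are free (0.0 to 1.0)
    All thresholds are illustrative assumptions.
    """
    if free_block_ratio < low_water:
        return True   # Nearly out of free blocks: collect regardless of load.
    if queue_depth <= idle_depth and free_block_ratio < target:
        return True   # Quiet period: reclaim space while little is waiting.
    return False      # Busy and not yet short of space: defer.


print(should_run_gc(queue_depth=32, free_block_ratio=0.20))  # False
print(should_run_gc(queue_depth=1, free_block_ratio=0.20))   # True
```

Because the host knows its own traffic pattern, it can defer housekeeping through a burst and catch up afterward, which an SSD working alone cannot anticipate.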
Hyperscalers' use of DRAM vs. DRAM-less SSDs
All this control comes at a cost. If the host is to synchronize its application software to the SSD, then the software must be written around the SSD's architecture.
For most businesses, this doesn't make sense since the bulk of their software is not custom created for the application. The cost of developing this software would be prohibitive for these companies.
But for a hyperscaler that will be deploying the same software over tens of thousands of servers, a million-dollar development effort makes perfect financial sense if it leads to $100 in annual savings for each of its 50,000 servers. That's a total savings of $5 million a year, enough to repay the development cost in under three months.
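The arithmetic behind that example is easy to check. The short Python sketch below uses the figures from the paragraph above; the function name is an assumption for illustration, and the calculation ignores ongoing maintenance costs.

```python
def payback(dev_cost, savings_per_server, servers):
    """Return (annual_savings, payback_years) for a one-time software effort."""
    annual_savings = savings_per_server * servers
    return annual_savings, dev_cost / annual_savings


annual, years = payback(dev_cost=1_000_000, savings_per_server=100, servers=50_000)
print(f"Annual savings: ${annual:,}")         # Annual savings: $5,000,000
print(f"Payback period: {years:.1f} years")   # Payback period: 0.2 years
```

Run the same numbers for a business with 500 servers and the annual savings drop to $50,000, so the payback period stretches to 20 years. That is why the approach rarely makes sense outside very large deployments.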
There will be other applications where the opposite is true. Certain new SSDs have been designed to communicate through the CXL protocol to achieve persistence at a high bandwidth. These SSDs, which are also interesting to hyperscalers, have huge internal DRAMs to hide their speed deficiencies. They are designed for a layer of the memory-storage hierarchy that evolved for non-volatile dual in-line memory modules and Intel's Optane DIMMs.
Other top uses
Lower-end applications, which put few demands on the SSD, can benefit from the cost savings of leaving the DRAM out of the SSD. Higher-end applications will see the most benefit when paired with customized software. This might include certain scientific applications and highly customized applications that don't simply package commercial software and hardware together but require extensive code development. In these applications, a DRAM-less SSD can improve performance and reduce costs.
As a result, DRAM-less SSDs serve two ends of the spectrum:
- Budget systems that can tolerate slightly reduced performance in return for hardware cost savings.
- Hyperscale and other systems that use customized software to squeeze out important performance gains and lower BOM costs and energy consumption.
In either case, SSD selection might not be the first consideration while configuring the system. But a DRAM-less SSD can provide important benefits if the organization applies it correctly.