NAND flash SSDs continue to be a hot topic for IT pros. They've proliferated across workplaces and data centers and are found in a variety of storage systems, including SANs, NAS, DAS, and converged and hyperconverged infrastructures. Most storage acquisition discussions now include flash SSDs. However, there is a wide range of products, and deciding on the best one can be a difficult process.
An important factor in choosing a flash SSD is the number of bits per cell. Vendors offer SSDs that use one of four types of NAND cells:
- Single-level cell (SLC): 1 bit.
- Multi-level cell (MLC): 2 bits.
- Triple-level cell (TLC): 3 bits.
- Quad-level cell (QLC): 4 bits.
Vendors are also working on penta-level cell (PLC) flash SSDs, but the consensus among manufacturers and industry analysts is that we're at least two years off from any large-scale PLC flash shipments. However, Solidigm has built a working prototype of a PLC drive.
Regardless of any PLC progress being made, SLC, MLC, TLC and QLC are the current NAND flash options. Each of the four types offers both advantages and disadvantages in terms of performance, capacity, durability, reliability, power efficiency and costs. Here are a few general facts to consider when choosing a flash SSD:
- Capacity increases with each bit added to the memory cell, while the number of bit states the cell must support doubles.
- Performance decreases with each bit added to the cell. In addition, more power is consumed to handle the read/write operations.
- Write endurance decreases with each bit added to the cell. Endurance is measured by the number of supported program/erase (P/E) cycles over the SSD's lifetime.
- Errors increase with each bit added to the cell, putting more demand on the controller to accurately read data, while increasing the amount of error correction code in the SSD controller.
When performance and reliability are the top concerns, SLC is a good choice, although MLC can be a viable alternative, depending on the workloads. When capacity and cost are the primary drivers, QLC is the best choice if running read-intensive workloads. Otherwise, TLC's higher reliability and write endurance might be the better option. Choosing a flash SSD is a matter of tradeoffs among performance, reliability, capacity and costs.
Flash SSDs have made significant inroads into today's data center, but they're still generally more expensive than HDDs when measured on a cost-per-gigabyte basis. For mission-critical workloads, the added expense can often be justified because of the less tangible benefits they offer, such as improved customer service, increased productivity or greater competitive edge. At the same time, the growth of data from IoT, 5G, AI and machine learning has created a need for storage that can deliver both performance and capacity. That demand is driving SSDs that store more data more efficiently, which helps to bring down the per-gigabyte cost.
Flash SSD cost and capacity
The flash industry often employs two strategies for increasing capacity and thereby decreasing costs. The first of these, as already pointed out, is to add more bits per memory cell.
The earliest flash SSDs contained only 1 bit per cell, with each bit always in one of two states: 1 or 0. Every bit added beyond that doubles the cell's number of potential bit states, so a cell that holds n bits must support 2^n states.
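The relationship between bits per cell and bit states can be illustrated with a short Python sketch. The names `CELL_TYPES` and `states_per_cell` are illustrative only and do not come from any vendor tool.

```python
# Cell types named in this article, mapped to bits stored per cell.
CELL_TYPES = {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4, "PLC": 5}


def states_per_cell(bits: int) -> int:
    """Return how many distinct states a cell needs to store `bits` bits."""
    if bits < 1:
        raise ValueError("a cell must store at least 1 bit")
    return 2 ** bits


if __name__ == "__main__":
    for name, bits in CELL_TYPES.items():
        print(f"{name}: {bits} bit(s) per cell, {states_per_cell(bits)} states")
```

Each added bit increases the stored data by one bit but doubles the number of states the controller has to tell apart, which is where the tradeoffs described later in this article come from.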
By increasing the number of bits per cell, manufacturers can increase the density of a NAND chip while maintaining the same chip size. In this way, they can add capacity without adding significant costs to the manufacturing process, helping to reduce the overall cost per gigabyte.
The second approach for increasing capacity is the move from 2D (planar) NAND chips to 3D chips. The 3D technology enables manufacturers to layer NAND cells on top of each other, helping to increase density, while reducing I/O. This approach comes with several challenges, however, in terms of performance and reliability.
Fortunately, NAND manufacturers have been able to address many of these issues by improving their manufacturing processes, updating controller software and introducing design modifications.
In fact, vendors now sell TLC and QLC 3D flash drives that have more than 200 layers and can store more than 1 TB of data. 3D technology has greatly increased flash SSD capacities and lowered the total cost per gigabyte, although it comes with tradeoffs in endurance, performance and error rates.
Tradeoffs in errors and performance
More bits per cell decrease performance, endurance and reliability, and they increase the error rate. For each bit added, it takes more time to write to and read from a cell. The additional bits also require more voltage to create and identify the bit states within each cell, as well as to control the flow of current across the transistor channel.
Increasing the number of bits increases a cell's exposure to noise, process variances and potential chip defects. Greater cell densities come with increased susceptibility to temperature variances. For example, higher temperatures can cause greater cell-level electron leakage. In general, the more bits per cell, the narrower the operating temperature range. Operating outside this range can lead to an increased error rate and the potential for data corruption.
Additional bits per cell also mean that the flash controller must incorporate more complex error-correcting technology. As error-correcting requirements increase, so does the time needed to carry out these operations, resulting in decreased IOPS and increased latency. That said, 3D layering can improve IOPS because it consolidates more cells onto a single chip, mitigating some of the impact that comes from the error correction operations.
Clearly, the number of bits per cell plays a vital role in flash SSD performance, but it is not the only factor. For example, many enterprise SSDs use some type of caching mechanism to improve performance. The cache might be made up of DRAM, storage class memory, SLC flash or another type of memory. The stronger the cache, the better the performance. Storage tiering can also improve performance, but this too depends on how effectively it's been implemented.
Most SSDs also use overprovisioning to ensure that write operations have immediate access to pre-erased blocks, helping to increase performance. Overprovisioning is the process of reserving a certain amount of storage space for the overhead that comes with managing storage and optimizing write operations.
Overprovisioning makes it possible to distribute P/E cycles over a greater number of memory blocks. The amount of overprovisioned storage varies among SSDs. Generally, as the bits per cell increase, so does the level of overprovisioning.
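As a rough illustration, overprovisioning is commonly expressed as the reserved space divided by the user-visible capacity. The sketch below assumes that definition; the function name and the capacities used are hypothetical and not taken from any specific drive.

```python
def overprovisioning_pct(physical_gb: float, user_gb: float) -> float:
    """Reserved capacity as a percentage of user-visible capacity.

    Assumes the common definition: (physical - user) / user * 100.
    """
    if user_gb <= 0 or physical_gb < user_gb:
        raise ValueError("expected 0 < user_gb <= physical_gb")
    return (physical_gb - user_gb) / user_gb * 100


if __name__ == "__main__":
    # Hypothetical drive: 1,024 GB of raw NAND exposing 960 GB to the host.
    print(f"{overprovisioning_pct(1024, 960):.2f}% overprovisioned")
```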
The SSD interface also affects performance. NVMe is a faster interface than SATA or SAS. In addition, the shared storage architecture and storage networking interconnect can play important roles in performance. For example, the use of NVMe-oF over Ethernet, Fibre Channel or InfiniBand networks can greatly improve performance when compared to traditional transport protocols.
Tradeoffs in endurance and reliability
Each bit added to a NAND flash cell significantly reduces its endurance. Endurance is measured as the number of writes -- P/E cycles -- that the cell supports before it wears out:
- SLC is rated at approximately 100,000 P/E cycles per cell.
- MLC is rated at approximately 10,000 P/E cycles per cell.
- TLC is rated at approximately 3,000 P/E cycles per cell.
- QLC is rated at approximately 1,000 P/E cycles per cell.
The number of supported P/E cycles can vary for each cell type, as can the estimates coming from various sources. The amounts shown here represent the generally accepted top range of today's SSDs.
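To get a feel for what these per-cell ratings could mean at the drive level, the following sketch multiplies capacity by rated P/E cycles. It is an idealized upper bound that assumes perfectly even wear across all cells. The `write_amplification` parameter is an assumption introduced here to represent extra internal writes from the controller; real values vary by drive and workload and are not given in this article.

```python
# Approximate top-range P/E ratings listed above.
PE_CYCLES = {"SLC": 100_000, "MLC": 10_000, "TLC": 3_000, "QLC": 1_000}


def ideal_write_budget_tb(capacity_tb: float, cell_type: str,
                          write_amplification: float = 1.0) -> float:
    """Idealized host data (in TB) writable before cells reach their rating.

    Assumes perfect wear leveling; write_amplification >= 1 is a
    hypothetical factor for internal writes such as garbage collection.
    """
    if write_amplification < 1.0:
        raise ValueError("write_amplification must be >= 1.0")
    return capacity_tb * PE_CYCLES[cell_type] / write_amplification


if __name__ == "__main__":
    for cell_type in PE_CYCLES:
        budget = ideal_write_budget_tb(1.0, cell_type)
        print(f"1 TB {cell_type}: up to about {budget:,.0f} TB written")
```

Vendor endurance ratings are typically well below this ceiling because of the factors discussed next.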
Another consideration is the SSD's endurance as a whole, rather than at the individual cell level. Two major factors affect a drive's overall endurance: the controller's capabilities and how much of the SSD has been overprovisioned.
An SSD's controller manages all aspects of the drive's operations, essentially serving as its local processor. The controller facilitates access to the NAND chips, manages the flow of data and tracks the address mapping. It also finds and corrects errors, and it applies garbage collection and wear-leveling algorithms to the data blocks.
Through its various operations, the controller helps to maximize performance and endurance. The more efficiently it carries out its operations, the better the drive performs and the longer it operates at peak efficiency.
In addition, SSDs use overprovisioning to enhance endurance, which also serves to improve performance.
An SSD's endurance rating is often indicated as terabytes written (TBW) or, in some cases, total bytes written. TBW refers to the total amount of data in terabytes that can be written to an SSD before the cells start wearing out. Another rating used to indicate endurance is drive writes per day (DWPD), which indicates how many times the drive's full capacity can be written each day over its warranty period.
DWPD is directly related to TBW. For example, suppose you have a 3D TLC flash drive with 1 TB of storage. The drive comes with a five-year warranty and is rated at 0.66 DWPD. As a result, the drive has an approximate 1,200 TBW rating.
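The conversion behind this example can be sketched in a few lines of Python. The function names are illustrative, and the calculation assumes a 365-day year.

```python
DAYS_PER_YEAR = 365  # assumption: leap days are ignored


def dwpd_to_tbw(dwpd: float, capacity_tb: float, warranty_years: float) -> float:
    """Total terabytes written implied by a DWPD rating."""
    return dwpd * capacity_tb * DAYS_PER_YEAR * warranty_years


def tbw_to_dwpd(tbw: float, capacity_tb: float, warranty_years: float) -> float:
    """Drive writes per day implied by a TBW rating."""
    return tbw / (capacity_tb * DAYS_PER_YEAR * warranty_years)


if __name__ == "__main__":
    # The example above: 1 TB drive, five-year warranty, 0.66 DWPD.
    tbw = dwpd_to_tbw(0.66, capacity_tb=1, warranty_years=5)
    print(f"{tbw:.1f} TBW")  # roughly 1,200 TBW
    print(f"{tbw_to_dwpd(1200, 1, 5):.2f} DWPD")
```

Running the conversion in both directions is a quick way to compare drives whose data sheets quote only one of the two ratings.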
SLC, MLC, TLC and QLC: How to choose what's right for your needs
Choosing the right flash drive typically comes down to balancing capacity and cost against performance, endurance and errors.
QLC flash SSDs generally have the lowest cost per gigabyte but come with serious endurance limitations. Performance is better than any HDD but lower than comparable drives with fewer bits per cell. QLC drives are best suited to data that doesn't change much, such as backups, cold data or even warm data, depending on the workloads. QLC flash SSDs are similar in many ways to more traditional write once, read many storage systems, like tape backups, but with much better performance.
SLC flash is often the best choice for mission-critical workloads that require top performance, endurance and reliability. For this reason, SLC flash is often used for military, aerospace or similar uses. Some SSDs also use SLC flash for the cache. In fact, SLC-based data caching demonstrates the huge performance advantage of SLC over TLC or QLC products.
If an organization has budget constraints but still wants storage that can deliver good performance, endurance and reliability, MLC flash might be a reasonable alternative to SLC, depending on the workload and nature of the data. Although basic MLC chips tend to be used for consumer products, improvements in enterprise MLC have led to drives that deliver greater reliability and endurance, making them better suited to enterprise or industrial use.
Even the fastest SLC or MLC storage isn't necessarily going to perform a great deal better than TLC or QLC flash if admins don't address other performance-related factors. For example, an SLC drive that uses a SAS or SATA interface rather than NVMe is never able to perform to its fullest potential because of the interface's limitations. On the other hand, NVMe comes with its own tradeoffs, such as higher CPU resource consumption.
Deciding on SLC vs. MLC vs. TLC vs. QLC requires weighing the pros and cons of each type. Here are several general guidelines to consider when evaluating an SSD based on the number of bits per cell:
- When performance and reliability are the top concerns -- and cost is less of a consideration -- the best bet is usually SLC flash, using either NVMe or NVMe-oF.
- When performance is important but so is cost, MLC might be a better alternative than SLC. In some cases, even TLC might work, depending on the workload and data requirements.
- When implementing a high-performance shared storage system, the cell bit count should be factored in with other considerations, such as the type of network or interconnection.
- When cost per gigabyte is a primary consideration and performance and write endurance are less of a concern, either TLC or QLC is likely the best option. The choice between the two often comes down to how write-intensive the workload is. QLC is a good option for read-intensive workloads that require reasonable performance.
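The guidelines above can be condensed into a simple decision helper. This is only a sketch of the rules of thumb in this article, not a sizing tool: the function name and its three boolean inputs are hypothetical simplifications, and it ignores the interface, network and caching factors that also matter.

```python
def suggest_cell_type(performance_critical: bool,
                      cost_sensitive: bool,
                      write_intensive: bool) -> str:
    """Map coarse workload priorities to a NAND cell type (rule of thumb)."""
    if performance_critical and not cost_sensitive:
        return "SLC"  # top performance and reliability, cost secondary
    if performance_critical and cost_sensitive:
        return "MLC"  # balance of performance and cost
    # Otherwise cost per gigabyte dominates; write intensity decides.
    return "TLC" if write_intensive else "QLC"


if __name__ == "__main__":
    print(suggest_cell_type(True, False, True))    # mission-critical workload
    print(suggest_cell_type(False, True, False))   # read-intensive archive
```

A workload that is neither performance-critical nor cost-sensitive falls through to the TLC or QLC branch in this sketch, which is a simplification rather than a recommendation.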
Today's SSD market
Organizations can choose from a variety of flash SSDs. Kioxia, for example, offers a wide range of SLC flash memory chips for both consumer and industrial uses, such as wearables, IoT devices, transportation, automation or digital healthcare. The chips can store 1 GB to 256 GB of data.
In some cases, organizations prefer MLC chips over SLC to save on costs, and here, too, they have plenty of options. For instance, Micron offers a number of MLC chips that can store between 16 GB and 1 TB and are available in both 2D and 3D formats. Multiple vendors offer complete MLC-based SSDs. For example, KingSpec sells a mini SATA drive -- the Yansen YSM600E -- that can store up to 1 TB of data.
Many vendors now offer TLC flash SSDs. For example, Micron has the 6500 Ion, a 3D TLC drive with over 200 layers. The company pits its 6500 Ion against QLC competitors, claiming price point parity with QLC, while delivering TLC performance, endurance and reduced power consumption. SK Hynix is coming out with a TLC flash SSD that exceeds 200 layers.
Another recent development comes out of a joint venture between Kioxia and Western Digital. The collaboration has resulted in 218-layer 3D flash memory that can use either TLC or QLC technology. Multiple vendors now sell QLC flash SSDs. For example, Solidigm offers the D5-P5430, a 3D flash drive that can store more than 15 TB of data.