The phrase redundant array of inexpensive disks (RAID) -- later changed to redundant array of independent disks -- emerged in the late 1980s, when mechanical hard disk drives (HDDs) were the primary storage media. The primary purposes of RAID were to improve performance and provide fault tolerance.
Technology vendors have since extended the concept of RAID to servers and storage systems that use higher performance NAND flash-based SSDs. SSD RAID is primarily used to protect against data loss in the event of a drive failure.
Storage systems in general have moved on from applying RAID at the whole-drive level, and redundancy is now applied to data at a finer granularity. As with conventional HDD-based RAID, data can be divided at the block level and distributed across multiple SSDs in a variety of ways.
There are three key concepts in RAID: mirroring, in which data is written simultaneously to two separate drives; striping, in which data is split evenly across two or more drives; and parity, in which raw binary data is passed through an operation to calculate a binary result, or parity block, used for redundancy and error correction.
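The parity concept above can be illustrated with a minimal sketch. It assumes the parity operation is a bytewise XOR, which is the usual choice for single-parity schemes; the helper name `xor_blocks` and the sample data are hypothetical and not taken from any particular RAID implementation.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, as if each were stored on a separate drive (sample data).
data_blocks = [b"RAID", b"with", b"SSDs"]

# The parity block would be stored on a further drive.
parity = xor_blocks(data_blocks)

# Simulate losing the drive that held the second block, then rebuild it
# by XOR-ing the surviving data blocks with the parity block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data_blocks[1])
```

Because XOR is its own inverse, the same function both computes the parity block and reconstructs any single missing block. It cannot recover two missing blocks, which is why double-parity schemes need a second, independent calculation.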
Standard RAID levels in use with HDD- and SSD-based systems include RAID 0 (simple striping); RAID 1 (simple or multimirroring); RAID 3 (byte-level striping, plus one drive dedicated to storing parity information); RAID 4 (block-level striping with a parity drive); RAID 5 (block-level striping with distributed parity, which requires at least three drives); and RAID 6 (block-level striping with a double distributed parity scheme).
Striping, with no redundancy or parity, is often used to increase performance. Striping with parity or double parity strengthens data protection. With every level except RAID 0, storing redundant data, either as mirror copies or as parity blocks, enables the system to reconstruct the lost information if a drive fails; RAID 6 and multimirroring can tolerate more than one drive failure.
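The trade-off among the standard levels can be sketched as a small calculation of usable capacity and the number of drive failures each level tolerates. This is a simplified model that assumes identical drives and the commonly cited minimum drive counts; the function name `raid_profile` and the table names are hypothetical.

```python
# Commonly cited minimum drive counts per standard level (assumed here).
MIN_DRIVES = {0: 2, 1: 2, 3: 3, 4: 3, 5: 3, 6: 4}

# Number of drives' worth of capacity consumed by parity.
PARITY_DRIVES = {3: 1, 4: 1, 5: 1, 6: 2}


def raid_profile(level, drives, drive_tb):
    """Return (usable_tb, drive_failures_tolerated) for identical drives."""
    if level not in MIN_DRIVES:
        raise ValueError(f"unsupported RAID level: {level}")
    if drives < MIN_DRIVES[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DRIVES[level]} drives")
    if level == 0:
        # Pure striping: all capacity is usable, nothing is protected.
        return drives * drive_tb, 0
    if level == 1:
        # Mirroring: every drive holds a full copy of the same data.
        return drive_tb, drives - 1
    # Parity levels: capacity of the parity drives is given up for protection.
    parity = PARITY_DRIVES[level]
    return (drives - parity) * drive_tb, parity


for level in (0, 1, 5, 6):
    usable, tolerated = raid_profile(level, drives=4, drive_tb=2.0)
    print(f"RAID {level}: {usable} TB usable, tolerates {tolerated} failure(s)")
```

The model ignores spare drives, metadata overhead and vendor-specific layouts, so real arrays will report somewhat different figures.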
HDD-based RAID vs. SSD-based RAID
One of the primary purposes of HDD-based RAID was originally to increase performance. An operating system (OS) would see the HDDs as one logical storage unit, but because read and write operations were spread across multiple drives, inputs/outputs (I/Os) could be carried out in parallel, thereby reducing response times and increasing throughput.
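How a single logical unit spreads I/O across drives can be sketched with a simple striping address map. The round-robin layout, the function name `locate_block` and the `chunk_blocks` parameter are assumptions for illustration, not the layout of any specific controller.

```python
from collections import Counter


def locate_block(logical_block, num_drives, chunk_blocks=1):
    """Map a logical block number to (drive_index, block_offset_on_drive).

    Chunks of `chunk_blocks` consecutive blocks are assigned to drives
    in round-robin order (plain striping, no parity).
    """
    if num_drives < 2:
        raise ValueError("striping needs at least two drives")
    chunk_index, within_chunk = divmod(logical_block, chunk_blocks)
    stripe_row, drive = divmod(chunk_index, num_drives)
    return drive, stripe_row * chunk_blocks + within_chunk


# A sequential request for eight logical blocks on a four-drive stripe set.
placements = [locate_block(block, num_drives=4) for block in range(8)]

# Count how many of the eight blocks each drive has to serve.
blocks_per_drive = Counter(drive for drive, _ in placements)
print(dict(blocks_per_drive))
```

Each drive serves only a quarter of the request, so the four drives can work on it at the same time, which is the source of the throughput gain described above.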
Storage systems generally do not use RAID to pool SSDs for performance purposes. Flash-based SSDs inherently offer higher performance than HDDs, and enable faster rebuilds in parity-based RAID. Rather than improve performance, vendors typically use SSD-based RAID to protect data if a drive fails.
Some flash array vendors have developed SSD RAID strategies they claim go beyond standard RAID and offer advantages, such as minimizing the performance impact of some types of RAID. Other reasons flash storage vendors consider changes or alternatives to standard RAID include the differences in the way HDDs and SSDs fail.
When an HDD fails, the entire drive is typically lost. With an SSD, only a part or parts of the drive may fail. As a result, some vendors have weighed customized approaches to RAID for protection against a drive failure.
Examples of non-standard RAID in current use with all-flash arrays include Dell EMC's XtremIO Data Protection (XDP) and Pure Storage's RAID-3D.
One distinction between XDP and standard RAID algorithms is a reduction in the I/O operations required per stripe update, according to Dell EMC. Dell EMC claimed that prior RAID algorithms had to consider how to keep data contiguous to avoid disk drive head seeks, whereas XDP presumes random-access media, such as flash, and is able to lay out and read back data with greater efficiency.
RAID-3D treats a performance delay as a drive failure and uses parity to address bottlenecks and facilitate consistent latency, according to Pure Storage. Pure Storage claimed that RAID-3D also uses independent checksums and dedicated parity to detect and address bit errors.
NetApp's SolidFire all-flash array uses a distributed replication algorithm, known as Helix, as an alternative to traditional RAID. Helix spreads redundant copies of data across the drives in a storage cluster, rather than across a limited RAID set, according to the vendor.
SSD RAID array vs. HDD RAID array
The term SSD RAID is sometimes used as an alternative name for a storage array that is equipped with flash-based SSDs and uses a form of RAID.
Advantages of SSD-based storage arrays over HDD-based storage arrays include reduced access time and superior I/O performance. However, ideal SSD RAID performance requires the right combination of microprocessor, cache, software and hardware resources, because any one of them can become the bottleneck. When all these factors are well matched, an SSD RAID can significantly outperform an HDD RAID of comparable storage capacity.
A typical SSD consumes less power than an HDD. When large numbers of drives are combined, the power savings of an SSD RAID array compared with an HDD RAID array can translate to lower long-term operating costs. In large data centers, the improved efficiency of SSDs compared with mechanical HDDs can also reduce the cooling cost, both in terms of simpler cooling systems and lower electric bills.
Cons of SSD RAID
SSD RAID has limitations and drawbacks, largely related to the storage media. SSDs carry a higher price per gigabyte than HDDs of comparable storage capacity. NAND flash-based drives are also limited to a certain number of program/erase cycles before they wear out, become unreliable and require replacement.
Although the best SSDs have life expectancies comparable to mechanical HDDs, the replacement cost for an SSD exceeds the replacement cost for an HDD of comparable storage capacity.
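The program/erase limit described above can be turned into a rough lifetime estimate. The sketch below is a back-of-the-envelope model, not a vendor formula: the function name, the `write_amplification` parameter (extra internal writes the drive performs per host write) and all of the sample figures are assumptions chosen only for illustration.

```python
def estimate_lifetime_days(capacity_gb, pe_cycles, daily_writes_gb,
                           write_amplification=1.0):
    """Estimate days until the rated program/erase cycles are used up.

    Assumes wear is spread evenly across the whole drive.
    """
    if daily_writes_gb <= 0 or write_amplification <= 0:
        raise ValueError("daily writes and write amplification must be positive")
    # Total host data the drive can absorb before reaching its cycle limit.
    total_host_writes_gb = capacity_gb * pe_cycles / write_amplification
    return total_host_writes_gb / daily_writes_gb


# Hypothetical drive: 1,000 GB, rated for 3,000 cycles, 500 GB written per day,
# with each host write causing two writes to flash.
days = estimate_lifetime_days(1000, 3000, 500, write_amplification=2.0)
print(f"about {days / 365:.1f} years")
```

Heavier write workloads shorten the estimate proportionally, which matters for RAID levels that add parity writes on top of the data writes.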