software redundant array of independent disks (software RAID)

What is software redundant array of independent disks (software RAID)?

Software RAID, also known as virtual RAID, is a form of redundant array of independent disks (RAID) that is implemented in the host server's operating system rather than on a dedicated hardware controller.

RAID is, in general, a data storage virtualization technology in which multiple physical disk drives are configured into one or more logical drives to provide data redundancy, improve performance or both, and to minimize the impact of a drive failure.

RAID is available in two forms: hardware RAID and software RAID. In hardware RAID, the array is managed by dedicated hardware, either a controller on the motherboard or a separate physical controller. In software RAID, the array is a logical construct built across physical storage devices and controlled within the operating system (OS). In both implementations, the array appears to applications as a single logical drive.

What is the difference between software RAID and hardware RAID?

The core concept behind RAID is that the same data is stored in different places across an array of hard drives or solid-state drives to achieve redundancy, so that the data is protected and remains accessible if any one drive fails. In hardware RAID, a dedicated controller handles the traffic in the array to optimize the drives' performance.

This general description covers a wide range of implementations and, in turn, a varying range of data protection. There are multiple RAID levels, but not all RAID implementations are for data redundancy.

In hardware RAID, the host computer's OS communicates with the RAID controller for access to the data.

For end users and applications, the differences between software RAID and hardware RAID are completely transparent. Both RAIDs function logically in the same way and provide the same benefits. But there are two big differences between them:

  • Latency. Software RAID is generally slower than hardware RAID due to the processing burden on the host server. Hardware RAID side-steps the CPU, using the onboard processing of the RAID controller.
  • Cost. Because of the extra hardware, hardware RAID can be considerably more expensive than software RAID, and the controller itself can become a point of failure.

Figure: How software RAID and hardware RAID compare. Software RAID doesn't require a separate controller, whereas hardware RAID requires a dedicated controller.

How does software RAID work?

Software RAID is conceptually the same as hardware RAID: data is spread or duplicated across an array of drives to protect against data loss and maintain high availability. But where hardware RAID is coordinated by independent hardware, software RAID is managed directly by the OS without a dedicated controller.
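As a rough illustration of array logic living in software rather than in a controller, the following Python sketch models a mirrored volume in memory. The `MirroredVolume` class and its methods are hypothetical names invented for this example; a real OS does this work in its kernel storage layer, not in application code.

```python
class MirroredVolume:
    """Toy mirrored volume: every block is written to all member disks."""

    def __init__(self, num_disks=2, num_blocks=8):
        # Each "disk" is a list of blocks; None marks an unwritten block.
        self.disks = [[None] * num_blocks for _ in range(num_disks)]
        self.failed = set()  # indexes of disks marked as failed

    def write(self, block, data):
        # The software layer, not a controller, duplicates the write.
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                disk[block] = data

    def fail_disk(self, index):
        # Simulate a drive failure by discarding that disk's contents.
        self.failed.add(index)
        self.disks[index] = [None] * len(self.disks[index])

    def read(self, block):
        # Serve the read from the first surviving member.
        for index, disk in enumerate(self.disks):
            if index not in self.failed:
                return disk[block]
        raise OSError("all mirror members have failed")


volume = MirroredVolume()
volume.write(0, b"payroll")
volume.fail_disk(0)
print(volume.read(0))  # b'payroll'
```

The caller only ever sees one logical volume; which physical member served the read is hidden behind `read`.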

What are the benefits of RAID?

Most RAID levels make data secure against unrecoverable sector errors, thus improving reliability, availability and performance. RAID also boosts a system's fault tolerance. With many drives doing the logical work of one, each device undergoes less wear and tear over time, increasing mean time to failure. MTTF measures the average time a non-repairable component operates before it fails.

Storing data redundantly across multiple disks also increases fault tolerance: the array can keep running after a drive fails, which lengthens the mean time between failures (MTBF) of the array as a whole, even though a larger set of drives sees individual drive failures more often. However, reliability and performance can be a trade-off, and the balance often depends on the RAID level being employed.

Why use software RAID?

Lower cost is one of the main reasons to choose software RAID over hardware RAID. Flexibility is another: because the array is configured purely in software, it can be reconfigured extensively with little difficulty. With hardware RAID, reconfiguration is often constrained by the controller.

RAID level refers to the technique a particular RAID uses to implement its data storage. RAID 0, for example, uses a technique called striping, while RAID 1 uses mirroring, and so on. With software RAID, the OS itself often delivers everything needed to implement it.
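To make the striping technique concrete, here is a minimal Python sketch of how chunks might be dealt round-robin across disks. The function names `locate_chunk` and `stripe`, and the chunk size used, are assumptions made for illustration; real implementations differ in chunk size and on-disk layout.

```python
def locate_chunk(logical_chunk, num_disks):
    """Return (disk index, row on that disk) for a round-robin stripe layout."""
    return logical_chunk % num_disks, logical_chunk // num_disks


def stripe(data, num_disks, chunk_size):
    """Split data into chunks and deal them round-robin across disks."""
    disks = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), chunk_size):
        chunk_number = offset // chunk_size
        disk_index, _row = locate_chunk(chunk_number, num_disks)
        # Chunks arrive in order, so appending places each at its row.
        disks[disk_index].append(data[offset:offset + chunk_size])
    return disks


layout = stripe(b"ABCDEFGHIJKL", num_disks=3, chunk_size=2)
print(layout)  # [[b'AB', b'GH'], [b'CD', b'IJ'], [b'EF', b'KL']]
```

Because consecutive chunks land on different disks, a large sequential read can be served by several drives at once, which is where the performance gain of striping comes from. No chunk is stored twice, so losing any one disk loses part of the data.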

Figure: A list of conventional RAID levels.

What are the RAID levels?

To fully understand the benefits of RAID, it's necessary to review the different RAID levels.

RAID levels fall into three categories: standard, nonstandard and nested.

  • Standard levels of RAID are made up of the basic types of RAID numbered 0 through 6.
  • A nonstandard RAID level is set to the standards of a particular company or open source project. Nonstandard RAID includes RAID 7, adaptive RAID, RAID-S and Linux md RAID 10.
  • Nested RAID refers to combinations of RAID levels, such as RAID 10 (RAID 1+0) and RAID 50 (RAID 5+0).

The RAID level a storage administrator uses should depend on their site's performance and redundancy requirements. As far as the standard RAID levels go, RAID 0 is the fastest, RAID 1 is the most reliable and RAID 5 is a good combination of both. The best RAID for any organization depends on the level of data redundancy needed, the length of their retention period, the number of disks they're working with and the importance placed on data protection versus performance optimization.

The following are some specifics:

  • RAID 0: Disk striping. With simple disk striping, data is divided into chunks and spread across devices without using disk parity. Performance is high, but there is no redundancy, so the failure of any one drive loses data. RAID 0 is good for high-speed, non-critical applications.
  • RAID 1: Disk mirroring. Disks mirror one another, so if one fails, the other takes over. Data protection is high, but write performance isn't, as all inbound data must be written twice, once to each disk, which slows writes down.
  • RAID 10: Disk mirroring and striping. RAID 10, or RAID 1+0, combines the two approaches above. Data is mirrored first, then striped across the mirrored pairs. Both performance and protection are high. However, this approach requires a minimum of four drives, and performance can suffer while the array runs degraded after a drive fails.
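  • RAID 3: Parity disk. Data is striped across the data disks, and a dedicated parity disk stores parity data generated by the RAID controller (or, in software RAID, by the OS) to facilitate reconstruction of lost data. RAID 3 is great for data protection but is complicated, and the dedicated parity disk can become a performance bottleneck because every write must update it.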
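The way parity data allows lost data to be reconstructed can be sketched with XOR, the operation commonly used for single-parity RAID. This is a simplified illustration with made-up block contents, not the layout of any specific product; the helper name `xor_blocks` is an assumption.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, one per data disk (illustrative values).
data_blocks = [b"\x0f\xf0", b"\x33\x55", b"\xaa\x01"]

# The parity block is the XOR of all data blocks.
parity = xor_blocks(data_blocks)

# Suppose the disk holding block 1 fails: XOR the survivors with the parity.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data_blocks[1])  # True
```

XOR works here because applying it twice with the same value cancels out, so combining the parity with every surviving block leaves exactly the missing block. A single parity block can recover only one lost block per stripe.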

There are additional variations on these themes:

  • RAID 4. Parity disk and block-level striping.
  • RAID 5. Block-level striping with distributed parity.
  • RAID 50. Striping (RAID 0) across multiple RAID 5 sets.
  • RAID 6. Disk striping with double parity.
  • RAID 7. Nonstandard level with caching.
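When weighing these levels, it can help to compare usable capacity against the number of drive failures each level is guaranteed to survive. The sketch below encodes the commonly cited rules for a few levels, assuming equal-size disks and two-way mirrors for RAID 10. The `RAID_RULES` table and `summarize` function are hypothetical helpers written for this example.

```python
# level: (minimum disks, data-bearing disks for n disks, guaranteed failures survived)
RAID_RULES = {
    "0": (2, lambda n: n, lambda n: 0),
    "1": (2, lambda n: 1, lambda n: n - 1),
    "5": (3, lambda n: n - 1, lambda n: 1),
    "6": (4, lambda n: n - 2, lambda n: 2),
    "10": (4, lambda n: n // 2, lambda n: 1),  # assumes two-way mirrors
}


def summarize(level, num_disks, disk_size_tb):
    """Return (usable capacity in TB, guaranteed drive failures survived)."""
    min_disks, data_disks, tolerated = RAID_RULES[level]
    if num_disks < min_disks:
        raise ValueError(f"RAID {level} needs at least {min_disks} disks")
    if level == "10" and num_disks % 2:
        raise ValueError("RAID 10 with two-way mirrors needs an even disk count")
    return data_disks(num_disks) * disk_size_tb, tolerated(num_disks)


for level in RAID_RULES:
    capacity, failures = summarize(level, num_disks=4, disk_size_tb=2.0)
    print(f"RAID {level}: {capacity} TB usable, survives {failures} failure(s)")
```

The figure for RAID 10 is the guaranteed minimum; an array can survive more failures if they happen to fall in different mirrored pairs.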

Choosing the best RAID for your needs

When comparing software vs. hardware RAID, there's no definitive answer as to which is better. The choice between the two depends on several factors, including the organization's budget as well as its performance and reliability expectations.

Software RAID is more suitable for non-critical applications that require higher flexibility. Hardware RAID is suitable for mission-critical applications that require maximum speed, availability and security.

This was last updated in April 2024
