How to choose the correct RAID level
This tip offers nine things to consider when deciding which RAID level is right for your organization's needs.
Aligning the RAID level with the application, the type of disk drive and budget criteria is as relevant today as it was 10 years ago.
For example, if you're looking for high-performance reads and writes, you'll probably want to use smaller disk drives and avoid RAID 6. If you want to store large amounts of data where rebuilds can occur in the background, RAID 5 and RAID 6 can be a good fit when properly configured to your application needs. If your concern is performance while a failed disk drive is being rebuilt, consider a RAID level that minimizes or eliminates that impact, for example RAID 1. In the end, it comes down to a balancing act among your budget dollars, performance requirements, data availability, capacity, energy consumption and survivability, along with application service requirements and individual or business partner preferences.
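To make the capacity side of that balance concrete, here is a minimal sketch of the textbook arithmetic for usable capacity and guaranteed drive-failure tolerance at the RAID levels mentioned above. The function names and the 8-drive, 4 TB example are illustrative assumptions, not taken from any vendor tool; real arrays also give up some capacity to spares, metadata and formatting.

```python
# Illustrative sketch only: textbook capacity arithmetic for common RAID levels.
# Function names and example figures are assumptions made for this tip.

# Number of simultaneous drive failures each level is guaranteed to survive.
# RAID 10 can survive more if the failures land in different mirror pairs,
# but only one failure is guaranteed.
GUARANTEED_FAILURES = {"RAID 1": 1, "RAID 10": 1, "RAID 5": 1, "RAID 6": 2}


def data_drives(level: str, drives: int) -> int:
    """Return how many drives' worth of capacity remain for data."""
    if level == "RAID 1":
        if drives != 2:
            raise ValueError("this sketch assumes a two-drive mirror for RAID 1")
        return 1
    if level == "RAID 10":
        if drives < 4 or drives % 2:
            raise ValueError("RAID 10 needs an even number of drives, at least 4")
        return drives // 2
    if level == "RAID 5":
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return drives - 1  # one drive's worth of parity
    if level == "RAID 6":
        if drives < 4:
            raise ValueError("RAID 6 needs at least 4 drives")
        return drives - 2  # two drives' worth of parity
    raise ValueError(f"unsupported RAID level: {level}")


def usable_tb(level: str, drives: int, drive_tb: float) -> float:
    """Return usable capacity in TB before spares and formatting overhead."""
    return data_drives(level, drives) * drive_tb


if __name__ == "__main__":
    for level in ("RAID 10", "RAID 5", "RAID 6"):
        print(f"{level}: {usable_tb(level, 8, 4.0):.0f} TB usable, "
              f"survives any {GUARANTEED_FAILURES[level]} drive failure(s)")
```

With eight 4 TB drives, this arithmetic gives 16 TB for RAID 10, 28 TB for RAID 5 and 24 TB for RAID 6, which is the capacity cost you weigh against the performance and rebuild considerations that follow.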
When deciding which RAID level is right for your needs, consider the following:
- If you rely on RAID 6 to offset the long rebuild times that follow the failure of a high-capacity disk drive, look at the root cause of the problem. In other words, avoid disk drives that have a higher likelihood of failing, or use RAID 1 to avoid the performance impact of a parity-based rebuild.
- How many rebuilds can a RAID controller run at any one time, assuming different RAID levels and available spare disk drives? After a failed drive is replaced, does a subsequent rebuild need to occur to return the spare disk to its original location as a spare? Again, if your problem is frequent drive failures and you do not address the root cause (for example, unreliable disk drives), you will need to compensate with the ability to support more disk drive rebuilds and accept the subsequent performance impact.
- What is involved in migrating a LUN or volume from one RAID level to another, and can the controller do it while data is being read or written? With growing emphasis on tiered storage and policy-based data management, the ability to transparently move data from one LUN to another on the same or a different storage system becomes important. Look for solutions that can transparently move data while it is being read and written, and that support interfaces to various policy management tools. Ask vendors whether their data movement tools allow active reading and writing to a file while it is being moved, and whether they require an application to be paused when data is accessed from a file that has been moved to a different tier of storage.
- What RAID levels are supported? At what granularity can different RAID levels operate concurrently, and across what number and types of disk drives? Also, look into what flexibility you have for tuning RAID, as well as what automatic RAID tuning the storage system or controller performs for hands-off operation. Support for multiple concurrent RAID levels matters because it lets you, for example, place log files for email, database and other applications on RAID 1 or RAID 10 for read/write-intensive workloads while using RAID 5 for less update-intensive workloads.
- Identify how the RAID implementation, regardless of RAID level, is optimized for large sequential I/O compared to small random I/O, across both reads and writes. For example, if you are going to be performing database updates and processing, you want your RAID system to be optimized for small random I/Os. On the other hand, if you are going to be reading large sequential video or audio files, you want your RAID system, independent of the RAID level, to support large sequential I/O operations. Keep in mind that there is generally a trade-off between IOPS and throughput (bandwidth or MB/sec) as the I/O size changes: if IOPS goes up, you should generally expect MB/sec to go down, and if MB/sec goes up, you should expect IOPS to go down. In other words, if you are doing large I/Os, expect IOPS to go down while MB/sec goes up.
- For dual or multiparity-based RAID implementations, what is done to mitigate the performance impact on both reads and writes, as well as during rebuild operations? For example, how does the RAID controller help to accelerate parity calculations and data movement to minimize the time and exposure during a rebuild? Another approach is for the RAID system to proactively migrate data from failing drives while avoiding false positives, that is, avoiding proactively failing a disk drive that has a correctable error instead of simply fixing the error.
- If a RAID offload or accelerator engine (chip, ASIC, FPGA) is being used, what functions does it perform, and what is the benefit to your applications? This should be transparent; what matters is how the underlying implementation can speed up multi-drive parity processing and drive rebuild time without negatively impacting performance.
- Keep in focus what level of service your various applications need and why you are using RAID to meet those requirements, so that you align the right approach to the situation at hand. Match the RAID level to the workload: sequential or random, small or large I/O, reads compared to writes. For example, for write-intensive workloads, avoid RAID 5 or RAID 6 and use RAID 1 or RAID 10 instead.
- Look at how cache is integrated and used in conjunction with a RAID controller, including read-ahead, write-back, write-through and other operations, along with how the cache is protected using mirroring, battery backup and NVRAM. There is a common misperception that more cache is better and that higher cache utilization means good performance. The reality is that some RAID systems need more cache to make up for a lack of raw I/O performance or of the ability to move data quickly to or from the disk drives. Look at the effectiveness of the cache, that is, how well it reduces response time, and then look at how the cache is being utilized. More is not always better; what matters is how effectively the resources are being used.
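Two of the points in the list above lend themselves to quick arithmetic: the IOPS-versus-throughput trade-off and the advice to avoid parity RAID for write-intensive work. The sketch below is a rough, hypothetical model rather than a sizing tool. The write-penalty values (2 for RAID 1 and RAID 10, 4 for RAID 5, 6 for RAID 6) are the commonly cited rule-of-thumb counts of back-end I/Os per small random write, and the model assumes no cache and no full-stripe write optimization. The function names and example figures are illustrative assumptions.

```python
# Rough, illustrative model only. It ignores cache, full-stripe writes and
# controller limits; names and figures are assumptions made for this tip.

# Rule-of-thumb back-end I/Os generated by one small random host write.
WRITE_PENALTY = {"RAID 1": 2, "RAID 10": 2, "RAID 5": 4, "RAID 6": 6}


def throughput_mb_s(iops: float, io_size_kb: float) -> float:
    """Throughput implied by an I/O rate at a given I/O size (1 MB = 1024 KB)."""
    return iops * io_size_kb / 1024


def iops_at_throughput(mb_s: float, io_size_kb: float) -> float:
    """I/O rate implied by a fixed bandwidth at a given I/O size."""
    return mb_s * 1024 / io_size_kb


def host_iops(backend_iops: float, write_fraction: float, level: str) -> float:
    """Host-visible IOPS a drive group can sustain for a read/write mix."""
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    penalty = WRITE_PENALTY[level]
    # Each read costs one back-end I/O; each write costs `penalty` of them.
    return backend_iops / ((1 - write_fraction) + write_fraction * penalty)


if __name__ == "__main__":
    # Same hypothetical 200 MB/sec of bandwidth at different I/O sizes.
    for size_kb in (8, 64, 1024):
        print(f"{size_kb:>5} KB I/Os: {iops_at_throughput(200, size_kb):,.0f} IOPS")
    # Same hypothetical 1,200 back-end IOPS with a 50% write mix.
    for level in ("RAID 10", "RAID 5", "RAID 6"):
        print(f"{level}: {host_iops(1200, 0.5, level):,.0f} host IOPS")
```

Under these assumptions, a fixed 200 MB/sec corresponds to 25,600 IOPS at 8 KB but only 200 IOPS at 1 MB, and a 50% write mix on 1,200 back-end IOPS yields about 800 host IOPS on RAID 10 versus 480 on RAID 5. Cache and controller optimizations can change these figures considerably, which is why the questions above are worth putting to vendors.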
Remember: RAID is not a substitute for backup and needs to be used in conjunction with some other form of data protection. If you do not combine RAID with other data protection techniques and technologies, a deleted file is gone. However, if you have a backup, snapshot or other point-in-time copy or view of the data, the file can be recovered.
About the author: Greg Schulz is founder and senior analyst with the IT infrastructure analyst and consulting firm StorageIO Group. Greg is also the author and illustrator of Resilient Storage Networks (Elsevier) and has contributed material to Storage magazine and other TechTarget venues.