SAN management best practices for optimizing performance
SANs consolidate resources and provide a high-speed network for storage traffic, but that doesn't mean you're off the hook when it comes to performance management.
Storage area networks can deliver the performance required by even the most demanding enterprise workloads. Even so, there are several SAN management best practices that must be followed to achieve optimal performance.
One of the key tasks required for keeping a storage system healthy is to proactively monitor the storage infrastructure. SAN vendors generally provide a management tool that you can use to configure and monitor their products, but third-party management tools are also widely available.
When it comes to infrastructure monitoring, the best approach is to begin by identifying several KPIs that reflect the overall storage health. Often, vendors will recommend the specific KPIs that should be monitored.
Once these KPIs have been identified, the next step is to take a baseline reading. Future values can then be compared against the baseline values as a way of assessing the storage performance and overall health. This type of proactive monitoring can help an organization detect and correct issues before they become a more serious problem. Most third-party management tools include an alerting mechanism to signal when KPIs exceed predetermined thresholds.
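To make the baseline idea concrete, here is a minimal sketch of comparing current readings against baseline values. The KPI names, baseline numbers and tolerances are all hypothetical, and a real management tool would collect the readings for you; the point is only to show the comparison logic.

```python
from dataclasses import dataclass


@dataclass
class Kpi:
    name: str         # hypothetical KPI identifier
    baseline: float   # value recorded when the baseline was taken
    tolerance: float  # allowed fractional rise above baseline (0.25 = 25%)


def check_kpis(kpis, current):
    """Return an alert message for each KPI that is missing or over its threshold."""
    alerts = []
    for kpi in kpis:
        value = current.get(kpi.name)
        if value is None:
            alerts.append(f"{kpi.name}: no reading")
            continue
        limit = kpi.baseline * (1 + kpi.tolerance)
        if value > limit:
            alerts.append(f"{kpi.name}: {value:.1f} exceeds threshold {limit:.1f}")
    return alerts


# Illustrative numbers only; these are not vendor recommendations.
kpis = [Kpi("read_latency_ms", 4.0, 0.25), Kpi("queue_depth", 8.0, 0.5)]
current = {"read_latency_ms": 6.2, "queue_depth": 9.0}
alerts = check_kpis(kpis, current)
for message in alerts:
    print(message)
```

In this sketch a threshold is simply the baseline plus a percentage. In practice, thresholds are often set per KPI, following vendor guidance.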
Capacity and performance management
One of the keys to achieving an optimal level of SAN performance is to strike a balance between capacity and performance. Suppose that an organization must equip a storage array with a specific amount of storage capacity. From a performance standpoint, it's usually better to reach that capacity by using numerous small disks rather than a few large ones. This strategy enables the storage array to handle a greater number of IOPS because read and write operations are distributed across a greater number of disks.
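A rough back-of-the-envelope calculation illustrates the point. The sketch below assumes that every disk delivers the same IOPS regardless of its capacity, and it ignores RAID overhead, caching and controller limits. The capacity, disk sizes and per-disk IOPS figure are made up for illustration.

```python
import math


def disks_needed(capacity_tb, disk_tb):
    """Number of disks required to reach the target raw capacity."""
    return math.ceil(capacity_tb / disk_tb)


def aggregate_iops(capacity_tb, disk_tb, iops_per_disk):
    """Naive aggregate IOPS: disk count multiplied by per-disk IOPS."""
    return disks_needed(capacity_tb, disk_tb) * iops_per_disk


# Same 48 TB target, reached two different ways (illustrative figures).
small_disks = aggregate_iops(48, 2, 150)    # 24 disks
large_disks = aggregate_iops(48, 12, 150)   # 4 disks
print(small_disks, large_disks)
```

Under these simplified assumptions, the array built from small disks has six times as many spindles to share the I/O load. The trade-off is more drive bays, more power and more components that can fail.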
One of the most important things that an organization can do to ensure that its SAN-based volumes deliver optimal performance is to carefully choose the RAID architecture to be used.
RAID configuration usually means striking a balance between performance and fault tolerance. RAID 0, for instance, is known as a stripe set. A stripe set consists of multiple disks working together as one. Because storage IOPS are distributed across these disks, the stripe set delivers far better performance than can be achieved with a single disk. However, a RAID 0 array offers no fault tolerance. If even a single disk in the stripe set fails, the entire volume fails.
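Conversely, a mirror set duplicates a disk's contents to one or more additional disks. Mirrors provide full redundancy to protect against failure but do little to improve performance: every write must go to each disk in the mirror, although some controllers can spread reads across the copies.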
The RAID level of choice in the enterprise is usually RAID 10, sometimes called RAID 1+0. It's essentially a stripe set in which every disk in the set is mirrored. A RAID 10 array delivers the performance of a RAID 0 array but with the fault tolerance of a mirrored set. The price is capacity: with two-way mirrors, half of the raw disk space is consumed by the mirror copies.
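The capacity and fault-tolerance side of this trade-off can be worked out with simple arithmetic. The following sketch assumes identical disks and two-way mirrors for RAID 10. It covers only the three layouts discussed here and does not attempt to model performance. The function name and the disk counts are illustrative.

```python
def raid_profile(level, disks, disk_tb):
    """Return usable capacity and the number of disk failures that are
    guaranteed to be survivable for a few common RAID layouts."""
    if level == "RAID 0":
        if disks < 2:
            raise ValueError("RAID 0 needs at least 2 disks")
        return {"usable_tb": disks * disk_tb, "guaranteed_failures": 0}
    if level == "RAID 1":
        if disks < 2:
            raise ValueError("RAID 1 needs at least 2 disks")
        # Every disk holds a full copy, so capacity equals one disk.
        return {"usable_tb": disk_tb, "guaranteed_failures": disks - 1}
    if level == "RAID 10":
        if disks < 4 or disks % 2:
            raise ValueError("RAID 10 needs an even number of disks, at least 4")
        # Striped two-way mirrors: half the disks hold mirror copies.
        # One failure is always survivable; more only if they hit different pairs.
        return {"usable_tb": disks // 2 * disk_tb, "guaranteed_failures": 1}
    raise ValueError(f"unsupported level: {level}")


for level, count in [("RAID 0", 8), ("RAID 1", 2), ("RAID 10", 8)]:
    print(level, raid_profile(level, count, 4))
```

With eight 4 TB disks, the stripe set exposes all 32 TB but survives no failures, while RAID 10 exposes 16 TB and is guaranteed to survive one.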
The most important thing to remember when it comes to LUN mapping is that not all storage hardware is created equal. Take a workload's performance and storage capacity requirements into account when mapping a LUN.
The key is to map a LUN that matches the workload's requirements as closely as possible. If a LUN is unable to keep pace with a workload's demand for storage IOPS, then the LUN will become a bottleneck and performance will suffer. Conversely, if the LUN's capabilities far exceed what the workload needs, then the hardware will be underutilized, meaning the organization is wasting money on overprovisioned hardware.
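One way to think about this matching exercise is as a simple selection problem: among the storage tiers that can satisfy the workload, pick the least expensive one. The sketch below is only an illustration of that reasoning. The tier names, IOPS limits, capacities and costs are invented, and real arrays expose this information in vendor-specific ways.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    name: str              # hypothetical tier label
    max_iops: int          # IOPS a LUN on this tier can sustain
    max_capacity_gb: int   # largest LUN the tier can host
    cost_per_gb: float     # relative cost, used to avoid overprovisioning


def pick_tier(tiers, required_iops, required_gb) -> Optional[Tier]:
    """Return the cheapest tier that meets both requirements, or None."""
    candidates = [
        t for t in tiers
        if t.max_iops >= required_iops and t.max_capacity_gb >= required_gb
    ]
    return min(candidates, key=lambda t: t.cost_per_gb, default=None)


# Invented example tiers.
tiers = [
    Tier("nl-sas", 2_000, 50_000, 0.03),
    Tier("sas", 10_000, 20_000, 0.10),
    Tier("ssd", 100_000, 10_000, 0.40),
]

choice = pick_tier(tiers, required_iops=8_000, required_gb=500)
print(choice.name if choice else "no tier satisfies the workload")
```

Filtering first prevents the bottleneck case, and choosing the cheapest remaining tier limits the waste from hardware that far exceeds the workload's needs.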
With the ongoing COVID-19 pandemic, it has become increasingly important to be able to remotely manage SAN systems. Remote management capabilities enable you to ensure ongoing health and performance, even when you are working from home.
One of the most important things to look for in remote SAN management software is the ability to examine all the various levels of the storage infrastructure. Performance issues could conceivably stem from any layer of the stack, so it's important for you to be able to access more than just basic performance and capacity metrics.