Manage large volumes of VM data with these storage tactics

Admins can effectively virtualize large servers with storage tiering and disk allocation, but they must be wary of potential IOPS and redundancy issues.

Many organizations still struggle to virtualize servers that hold large amounts of data. With creative use of storage tiering and disk allocation techniques, admins can bring even these servers into the virtual environment.

After the advent of virtualization, most organizations started with the low-hanging fruit. As the technology proved itself, they virtualized more and more important production servers. The servers left for last weren't always the most critical or CPU-heavy production systems. They were usually the largest ones.

Most data centers have some of these heavyweights. They're large in terms of disk size, but not in terms of CPU, memory or even disk I/O. The data in these servers can stretch to multi-terabyte levels.

Organizations often leave large servers out of the virtualization process because the VM data they produce would consume too much expensive shared storage. The problem is even worse on solid-state storage frames, where every terabyte costs more.

Large VMs continue to present several challenges, but with a few techniques, admins can figure out how to store them.

Use storage tiering to reorganize VM data

One of the key questions with large VMs is where to put them. All-flash arrays are expensive, so admins must understand their data profiles. If VMs are mostly idle or if the VM data is rarely accessed, then all-flash arrays aren't the right place for them.

Storage tiering lets the storage frame move rarely accessed VM data onto slower disks automatically, so a single frame can serve both active and inactive data. However, storage tiering is often expensive, and it requires disks with enough speed and capacity to make it work.
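To make the idea concrete, here is a minimal sketch of a tiering decision based only on how recently each virtual disk was accessed. The tier names, the 7-day and 90-day thresholds, and the disk list are assumptions made up for illustration. Real storage frames usually tier at the block or extent level using their own heat statistics, so treat this as a way to reason about a data profile, not as how any particular array works.

```python
from dataclasses import dataclass


@dataclass
class VirtualDisk:
    name: str
    size_gb: int
    days_since_last_access: int


def pick_tier(disk, hot_days=7, warm_days=90):
    """Return a tier name based on access recency alone (assumed thresholds)."""
    if disk.days_since_last_access <= hot_days:
        return "flash"
    if disk.days_since_last_access <= warm_days:
        return "sas"
    return "nearline"


def plan_tiers(disks):
    """Group disk names by the tier chosen for each disk."""
    plan = {"flash": [], "sas": [], "nearline": []}
    for disk in disks:
        plan[pick_tier(disk)].append(disk.name)
    return plan


# Hypothetical disks belonging to one large file server VM.
disks = [
    VirtualDisk("os.vmdk", 80, 0),
    VirtualDisk("data-current.vmdk", 500, 3),
    VirtualDisk("data-2019.vmdk", 4000, 400),
    VirtualDisk("logs.vmdk", 1000, 30),
]

plan = plan_tiers(disks)
for tier, names in plan.items():
    print(tier, names)
```

In this made-up profile, the 4 TB of old data is the only thing that lands on the slowest tier, which is exactly the capacity that would be most wasteful to keep on flash.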

If admins can't afford to implement this method, they might have to get creative with disk allocation. Admins should first see if they can separate VM data. Depending on the guest configuration, each drive letter can map to its own virtual hard drive file. Admins can then place those files manually, putting less active ones on slower or even local disks rather than on expensive storage area network (SAN) disk.

Admins must be cautious, though, because spreading a VM's disks across different frames can cause issues if one disk location fails without adequate redundancy. Manually tiering disks isn't easy, and allowing the frames to do it is best, if possible.

If admins can't split the drive or group it into logical segments, other challenges can arise. VMware and Microsoft systems can support large volumes of VM data, but that isn't an ideal situation for admins to be in.

As these large servers grow, admins often add and configure more disk groups, but these additions don't always offer the same disk speed or I/O profile. Having different IOPS for parts of the same disk volume can lead to unpredictable performance.
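A rough calculation shows why mixed disk groups make performance hard to predict. The sketch below assumes a single volume built from three disk groups with invented capacities and IOPS ratings, and it assumes I/O lands on each group in proportion to its capacity. Both are simplifications chosen for illustration.

```python
def iops_profile(groups):
    """Summarize a volume built from (capacity_tb, iops) disk groups.

    The weighted average assumes I/O is spread across groups in
    proportion to capacity, which real workloads rarely do exactly.
    """
    total_tb = sum(capacity for capacity, _ in groups)
    weighted = sum(capacity * iops for capacity, iops in groups) / total_tb
    slowest = min(iops for _, iops in groups)
    fastest = max(iops for _, iops in groups)
    return {
        "weighted_avg": weighted,
        "slowest": slowest,
        "fastest": fastest,
        "spread": fastest / slowest,
    }


# Hypothetical disk groups added to the same volume over time.
groups = [(4, 20000), (4, 5000), (8, 1500)]

profile = iops_profile(groups)
print(profile)
```

The average looks respectable, but a request that happens to hit the slowest group gets a small fraction of what the fastest group delivers. That gap, not the average, is what users experience as unpredictable performance.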

Large VMs challenge hyper-converged infrastructure

Hyper-converged infrastructure (HCI) platforms can also present problems with larger VMs. HCI platforms combine the necessary resources for virtualization in a convenient, self-contained package. Admins can use them for large VMs, but by their nature, HCI platforms aren't the right fit.

Technically, admins could map external storage to an HCI platform to deal with extensive VM data, but that tactic defeats the overall purpose of a compact HCI platform.

Admins might not be able to split a single large volume's load across the storage controllers of the SAN or network-attached storage. This depends on the storage frame's features, but load balancing in storage frames focuses on volumes, not bandwidth, so it's possible to overload a controller or front-end port.
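The difference between balancing by volume and balancing by bandwidth can be sketched with a few numbers. The example below assumes two controllers, a simple round-robin assignment by volume count, and invented throughput figures and limits. Actual frames use their own placement logic, so this only illustrates the failure mode described above.

```python
def assign_by_count(volumes, controllers=2):
    """Round-robin volumes across controllers, ignoring per-volume traffic.

    volumes: list of (name, mb_per_s) tuples.
    Returns the total MB/s that ends up on each controller.
    """
    load = [0] * controllers
    for index, (_, mb_per_s) in enumerate(volumes):
        load[index % controllers] += mb_per_s
    return load


# Hypothetical volumes and their sustained throughput in MB/s.
volumes = [
    ("big-fileserver", 900),
    ("web01", 50),
    ("archive", 700),
    ("web02", 40),
]

controller_limit = 1200  # assumed per-controller ceiling in MB/s

load = assign_by_count(volumes)
overloaded = [i for i, mb_per_s in enumerate(load) if mb_per_s > controller_limit]

print("load per controller:", load)
print("overloaded controllers:", overloaded)
```

Each controller owns two volumes, so the frame looks balanced by count, yet one controller carries almost all of the traffic. Because a single volume generally can't be divided between controllers, the only fix in this sketch is to move a whole volume.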

There isn't a perfect solution to storage overload because it's not always possible to shift or separate VM data, but admins should try to partition the storage before it continues to grow.

VM data presents backup challenges

Backing up large data volumes can also pose problems. In theory, traditional snapshots address this, but the delta change logs on a multi-terabyte volume can quickly grow out of control. Backup tools that incorporate some level of deduplication are critical to avoid the risk of runaway and corrupt snapshots. Block-level deduplication is preferable, but admins should at least use file-level deduplication.
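A back-of-the-envelope estimate shows how quickly snapshot deltas add up on a large volume. The function below is a simple upper bound that assumes every day's changes touch blocks not already captured in the delta. The 10 TB size and 3% daily change rate are illustrative assumptions, not measurements.

```python
def snapshot_delta_tb(volume_tb, daily_change_rate, days_open):
    """Upper-bound estimate of snapshot delta size in TB.

    Assumes each day's changed blocks are new to the delta, and caps
    the result at the volume size because a delta can't exceed it.
    """
    delta = volume_tb * daily_change_rate * days_open
    return min(delta, volume_tb)


# Hypothetical 10 TB volume with 3% daily change.
for days in (1, 7, 14):
    print(days, "days:", round(snapshot_delta_tb(10, 0.03, days), 2), "TB")
```

Under these assumptions, a snapshot left open for two weeks holds about 4.2 TB of changes, which must be stored somewhere and eventually merged back into the base disk.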

Large volumes are often too large to do full backups outside of production hours, so synthetic full backups and deduplication are essential. The same challenges can arise with traditional, non-hypervisor-based antivirus scanners. Scanning large volumes of VM data with traditional tools can take an excessive amount of time and can monopolize server resources.
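The backup-window problem is easy to check with arithmetic. The sketch below compares a full backup with a changed-data-only transfer, which is the amount a synthetic full approach has to read from production. The 20 TB volume, 400 MB/s throughput, 2% change rate and eight-hour window are assumed values for illustration.

```python
def transfer_hours(data_tb, throughput_mb_s):
    """Hours needed to move data_tb (decimal terabytes) at a sustained rate."""
    total_mb = data_tb * 1_000_000
    return total_mb / throughput_mb_s / 3600


volume_tb = 20          # assumed volume size
throughput_mb_s = 400   # assumed sustained backup throughput
window_hours = 8        # assumed nightly backup window
change_rate = 0.02      # assumed daily change rate

full_hours = transfer_hours(volume_tb, throughput_mb_s)
changed_hours = transfer_hours(volume_tb * change_rate, throughput_mb_s)

print("full backup hours:", round(full_hours, 1))
print("changed data hours:", round(changed_hours, 2))
print("full fits window:", full_hours <= window_hours)
print("changed data fits window:", changed_hours <= window_hours)
```

With these numbers, a full backup needs nearly 14 hours and overruns the window, while moving only the changed data takes well under an hour.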

Admins must consider protecting these types of volumes at the hypervisor layer. This can be better for performance and security, but if admins don't plan for it, adding a tool that enables this can prove costly.

Many other traditional tasks, such as Storage vMotion, run into challenges when VM data reaches the multi-terabyte range. This can be even more difficult if the storage includes additional logical unit numbers (LUNs) that don't share the same I/O profile.
