5 common virtualization problems and how to solve them
Organizations can correct common problems with virtualization, such as VM sprawl and network congestion, through business policies rather than purchasing additional technology.
Organizations often have to deal with virtualization problems such as VM sprawl, network congestion, server hardware failures, reduced VM performance and software licensing restrictions. But companies can mitigate these issues before they occur with lifecycle management tools and business policies.
Server virtualization brings far better system utilization, workload flexibility and other benefits to the data center. But virtualization isn't perfect: The hypervisors themselves are sound, yet the issues that arise from virtualization can waste resources and drive administrators to the breaking point.
VM sprawl wastes valuable computing resources
Organizations often virtualize an initial set of workloads, only to find they must buy more servers down the line to accommodate additional VMs. This occurs because companies usually don't have business policies in place to plan or manage VM creation.
Before virtualization, a new server took weeks -- if not months -- to deploy because companies had to budget for the system and coordinate its deployment. Bringing a new workload online was a big deal that IT professionals and managers scrutinized. With virtualization, a hypervisor can allocate computing resources and spin up a new VM on an available server in minutes. Once VMs are in the environment, there are rarely any processes in place to tell whether anyone needs or uses them. The result is VMs that accumulate over time and consume computing, backup and disaster recovery resources.
Because VMs are easy to create and destroy, organizations need policies and procedures that help them understand when they need a new VM, determine how long they need it and justify it as if it were a new server. Organizations should also consider tracking VMs with lifecycle management tools. There should be clear review dates and removal dates so that the organization can either extend or retire the VM.
All of this helps tie VMs to departments or other stakeholders so organizations can see exactly how much of the environment that part of the business needs. Some businesses even use chargeback tactics to bill departments for the amount of computing they use.
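The core of most lifecycle management tools is just an inventory that records an owner, a review date and a removal date for every VM. The sketch below is a minimal, hypothetical example in Python: the VM names, owners and dates are illustrative, and a real tool would pull the inventory from the hypervisor's management API rather than a hard-coded list.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class VMRecord:
    """One entry in a simple VM lifecycle inventory (illustrative)."""
    name: str
    owner: str          # department or stakeholder, useful for chargeback
    review_date: date   # when the VM's continued need must be re-justified
    removal_date: date  # when the VM is retired unless a review extends it

def vms_needing_attention(inventory: list[VMRecord], today: date) -> list[VMRecord]:
    """Return VMs whose review or removal date has passed."""
    return [vm for vm in inventory
            if today >= vm.review_date or today >= vm.removal_date]

# Hypothetical inventory; a real tool would query the hypervisor's API instead.
inventory = [
    VMRecord("hr-reporting-01", "HR", date(2024, 6, 1), date(2024, 12, 1)),
    VMRecord("dev-test-17", "Engineering", date(2025, 3, 1), date(2025, 9, 1)),
]

for vm in vms_needing_attention(inventory, date.today()):
    print(f"{vm.name} (owner: {vm.owner}) is due for review or retirement")
```

Even a simple report like this gives each VM a named owner and a date by which someone must justify keeping it, which is the policy backbone that prevents sprawl.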
VMs can congest network traffic
Network congestion is another common problem. For example, an organization that routinely runs its system numbers might notice that it has enough memory and CPU cores to house 25 VMs on a single server. But once IT admins load those VMs onto the server, they might discover that the server's only network interface card (NIC) port is already saturated, which can interrupt VM communication and cause some VMs to report network errors.
Before virtualization, one application on a single server would typically use only a fraction of the server's network bandwidth. But as multiple VMs take up residence on the virtualized server, each VM on the server demands some of the available network bandwidth. Most servers are only fitted with a single NIC port, and it doesn't take long for network traffic on a virtualized server to overwhelm the NIC. Workloads sensitive to network latency might report errors or even crash.
Standard gigabit Ethernet ports can typically support traffic from several VMs, but organizations planning high levels of consolidation might need to upgrade servers with multiple NIC ports to provide adequate network connectivity. Organizations can sometimes relieve short-term traffic congestion problems by rebalancing workloads to spread out bandwidth-hungry VMs across multiple servers.
Remember that NIC upgrades might also demand additional switch ports or switch upgrades. In some cases, organizations might need to distribute the traffic from multiple NICs across multiple switches to prevent switch backplane saturation. This requires the attention of a network architect who is involved in the virtualization and consolidation effort from the earliest planning phases.
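Before committing to NIC or switch upgrades, it helps to confirm that the NIC really is the bottleneck. The sketch below, which assumes a Linux host and an interface named eth0, samples the per-interface byte counters in /proc/net/dev twice and reports approximate throughput as a share of a 1 GbE port; the interface name, sampling interval and link speed are assumptions to adjust for the environment.

```python
import time

def read_bytes(interface: str) -> tuple[int, int]:
    """Read cumulative RX/TX byte counters for a NIC from /proc/net/dev (Linux)."""
    with open("/proc/net/dev") as f:
        for line in f:
            if line.strip().startswith(interface + ":"):
                fields = line.split(":")[1].split()
                return int(fields[0]), int(fields[8])  # rx_bytes, tx_bytes
    raise ValueError(f"interface {interface} not found")

def throughput_mbps(interface: str, interval: float = 5.0) -> tuple[float, float]:
    """Sample the counters twice and return approximate RX/TX rates in Mbps."""
    rx1, tx1 = read_bytes(interface)
    time.sleep(interval)
    rx2, tx2 = read_bytes(interface)
    rate = lambda delta: delta * 8 / interval / 1_000_000
    return rate(rx2 - rx1), rate(tx2 - tx1)

# "eth0" and the 1 Gbps link speed are assumptions; adjust for the host.
rx, tx = throughput_mbps("eth0")
for label, mbps in (("RX", rx), ("TX", tx)):
    print(f"{label}: {mbps:.0f} Mbps ({mbps / 1000 * 100:.0f}% of a 1 GbE port)")
```

If sustained utilization sits near the top of the port's capacity while VMs report latency problems, rebalancing workloads or adding NIC ports is likely to help; if it doesn't, the bottleneck is elsewhere.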
Consolidation will multiply the effect of server hardware failures
Consider 10 VMs all running on the same physical server. Virtualization provides tools such as snapshots and live migration that can protect VMs and ensure their continued operation under normal conditions. But virtualization does nothing to protect the underlying hardware. So, what happens when the server fails?
The physical hardware platform becomes a single point of failure that affects all of the workloads running on it. Greater levels of consolidation put more workloads on each server, so a single server failure disrupts more workloads. This is very different from traditional physical deployments, where a single server supported one application.
In a properly architected and deployed environment, the affected workloads fail over and restart on other servers. But there is some disruption to the workloads' availability during the restart. Remember that the workload must restart from a snapshot in storage and move from disk to memory on an available server. The process might take several minutes depending on the size of the image and the amount of traffic on the network. An already congested network might take much longer to move the snapshot into another server's memory.
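Simple arithmetic shows why the restart window stretches on a busy network. The sketch below works through an assumed example -- a 16 GB VM image crossing a 1 GbE link -- at two levels of available bandwidth; the figures are illustrative, not measurements.

```python
def transfer_seconds(image_gb: float, link_gbps: float, usable_fraction: float) -> float:
    """Estimate the time to move a VM image across the network.

    usable_fraction models how much of the link remains once
    other VM traffic has taken its share.
    """
    image_bits = image_gb * 8 * 1e9
    effective_bps = link_gbps * 1e9 * usable_fraction
    return image_bits / effective_bps

# Assumed example: a 16 GB image over a 1 GbE link.
for usable in (0.9, 0.3):
    minutes = transfer_seconds(16, 1.0, usable) / 60
    print(f"{int(usable * 100)}% of the link free: ~{minutes:.1f} minutes to restart")
```

With most of the link free, the failover takes a couple of minutes; on a congested link, the same image can take several times longer, which is exactly the disruption window that matters to the business.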
There are several tactics for mitigating server hardware failures. In the short term, organizations can opt to redistribute workloads to prevent multiple critical applications from residing on a single server. It might also be possible to lower consolidation levels in the short term to limit the number of workloads on each physical system.
Over the long term, organizations can deploy high-availability servers for important consolidation platforms. These servers might include redundant power supplies and numerous memory protection technologies such as memory sparing and memory mirroring.
These server hardware features help to prevent errors, or at least prevent them from becoming fatal. The most critical workloads might reside on server clusters, which keep multiple copies of each workload in synchronization. If one server fails, another node in the cluster takes over and continues operation without disruption.
Application performance can still be marginal in a VM
Organizations that decide to move their 25-year-old custom-written corporate database server into a VM might discover that the database performs slower than molasses. Or if organizations decide to virtualize a modern application, they might notice that it runs erratically or is just slow. There are several possibilities when it comes to VM performance problems.
Many older, in-house and custom-built applications were coded with direct hardware calls because that was one of the most efficient ways to write software. Unfortunately, any time organizations change the hardware or abstract it from the application, the software might not work correctly and usually needs to be recoded.
It's possible that antique software simply isn't compatible with virtualization; organizations might need to update it, switch to some other commercial software product that does the same job or continue using the old physical system the application was running on before. But none of these are particularly attractive options for organizations on a tight budget.
Organizations with a modern application that performs poorly after virtualization might find the workload needs more computing resources, such as memory space, CPU cycles and cores. Organizations can typically run a benchmark utility and identify any resources that are over-utilized, then provision additional computing resources to provide some slack. For example, if memory is too tight, the application might rely on disk file swapping, which can slow performance. Adding enough memory to avoid disk swapping can substantially improve performance.
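A quick check for the disk-swapping symptom described above is to read memory and swap utilization directly on the guest. The sketch below uses the third-party psutil library as an assumption -- any monitoring agent or the hypervisor's own performance counters would do the same job -- and the 90 percent threshold is illustrative.

```python
import psutil  # third-party library; install with: pip install psutil

MEMORY_WARNING_PCT = 90  # illustrative threshold; tune for the workload

mem = psutil.virtual_memory()
swap = psutil.swap_memory()

print(f"Memory in use: {mem.percent:.0f}%")
print(f"Swap in use:   {swap.percent:.0f}% ({swap.used / 1e9:.1f} GB)")

if mem.percent > MEMORY_WARNING_PCT and swap.used > 0:
    # High memory use combined with swap activity suggests the VM needs
    # more provisioned memory rather than more CPU.
    print("Likely memory pressure: consider allocating more RAM to this VM")
```

Running a check like this during a benchmark makes it clear whether the slowdown comes from memory pressure, and the same idea extends to CPU and storage counters.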
Whether the application in question is modern or legacy, testing in a lab environment prior to virtualization can identify troublesome applications and give organizations the opportunity to formulate answers to virtualization problems before rolling the VM out into production.
Software licensing is a slippery slope in a virtual environment
Software licensing has always been confusing and expensive, but software vendors are quickly catching up with virtualization technology and updating their licensing rules to account for VMs, multiple CPUs and other resource provisioning loopholes that virtualization allows. The bottom line is that organizations can't expect to clone VMs without buying licenses for the OS and the application running in each VM.
Organizations must always review and understand the licensing rules for any software that they deploy. Large organizations might even retain a licensing compliance officer to track software licensing and offer guidance for software deployment, including virtualization. Organizations should involve these professionals if they are available.
License breaches can expose organizations to litigation and substantial penalties. Major software vendors often reserve the right to audit organizations and verify their licensing. Most vendors are more interested in getting their licensing fees than litigation, especially for first offenders. But when organizations consider that a single license might cost thousands of dollars, careless VM proliferation can be financially crippling.
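The exposure is easy to put a rough number on. The sketch below multiplies an assumed per-license cost by a count of unlicensed VM clones; both figures are illustrative and should come from the organization's own contracts and inventory.

```python
def licensing_exposure(unlicensed_vms: int, cost_per_license: float) -> float:
    """Rough licensing exposure from VM clones created without new licenses."""
    return unlicensed_vms * cost_per_license

# Illustrative figures only: 30 unreviewed clones at $3,000 per license.
print(f"Potential exposure: ${licensing_exposure(30, 3_000):,.0f}")
```

Even modest sprawl at that price point runs well into five figures, which is why licensing review belongs in the same VM lifecycle policies described earlier.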
Server virtualization has changed the face of modern corporate computing. It enables efficient use of computing resources on fewer physical systems and provides more ways to protect data and ensure availability. But virtualization isn't perfect, and it creates new problems that organizations must understand and address to keep the data center running smoothly.