iQoncept - Fotolia

Tame the virtualization infrastructure management beast

IT administrators must manage virtual infrastructure requirements, such as patches, reboots and component lifecycles, to create and maintain an efficient data center.

Careful virtualization infrastructure management can enable IT administrators to better plan how and when to reboot, patch and deprecate each component, and it can ultimately help them wring as much efficiency as possible from the technology they have.

Virtualization infrastructure management is a core task that requires regular upkeep, but when done well, it goes almost entirely unnoticed.

Spinning up instances in a virtual environment is a relatively simple process, but it can invite management problems. It's easy to let VMs sprawl, updates languish and reboots lag. As a result, many admins waste time chasing problems that their policies should have prevented.

Most admins know they shouldn't manage virtual infrastructure requirements haphazardly, but it's difficult to avoid if they've inherited an infrastructure that is already set on a particular course. Admins must establish plans and policies to ensure that good virtualization infrastructure management habits don't fall by the wayside.

Plan virtualization infrastructure management around reboots and patches

Virtualization infrastructure management requires admins to shift from maintenance based on the built-in reboot schedules of physical servers to a deliberate IT maintenance plan built on the specific needs of virtual components.

Hypervisors require less maintenance than traditional OSes because of their lightweight code base, but the hypervisor's many dependencies mean that those infrequent reboots are even more important. A strict reboot schedule -- on a timeline of months rather than weeks -- can address memory leaks and runaway processes that may eventually cause major outages. Reboots keep hypervisors, as well as virtual guests and virtualization hardware, working at peak efficiency, and admins should prioritize this when creating a reboot schedule.

Infrastructure maintenance tasks
Figure A. Follow this checklist to ensure infrastructure upkeep.

Admins should apply patches in different windows, but with similar dependencies in mind. For example, in a VMware environment, patches and updates require an infrastructure-based strategy rather than a traditional OS software strategy. VMware patches resemble firmware updates more than software updates because virtualization affects a wider area than regular software.

If admins consolidate infrastructure updates into one window, the downtime will be longer and the recovery process more difficult if a core product update fails. A piecemeal approach to updates demands more steps, more paperwork and more frequent, shorter downtimes, but it can enable finer levels of control. Admins can start with central infrastructure and safer updates, such as hosts, before they proceed to more complex products, such as VMware NSX or vCenter.

Restrain virtualization sprawl

Virtualization enables the rapid creation and deployment of virtual instances, but if admins leave this process unchecked, excess instances can strain the data center's efficiency. Admins can build several tactics into their virtualization infrastructure management plans to mitigate or prevent VM sprawl.

VM resources correspond with host server resources, and if VMs don't consume those resources evenly, then a host server might run out of one resource, such as memory, and leave other resources, such as CPU cores, unusable. Sprawling VMs -- which consume resources indiscriminately and unevenly -- can make it seem necessary to purchase additional servers.

But more isn't always better. Many applications run as well on 2 CPUs as 16 CPUs, and virtualized hardware can show that applications don't actively use all the memory they have cached. Admins must resist the temptation to throw resources at a problem because this tactic is more likely to lead to sprawl than to efficiency.

Balance is more difficult if inexperienced or irresponsible staff request more resources than they can manage.

Admins can also use policies to control who can request virtualized instances and how long they can access those resources before admins review and recover them. Admins can use tools such as VMware vCenter Server and VMware vRealize Suite to deploy these policies and manage their lifecycles. More advanced tools can use automation and orchestration to standardize requests and queue unused instances for removal.

Balance is more difficult if inexperienced or irresponsible staff request more resources than they can manage. Admins can employ a chargeback model to make different departments pay for the VM instances they use.

If VM sprawl is already present, admins must hunt down inactive VMs. These VMs -- sometimes called zombie VMs -- will continue to consume resources without offering any value.

Find zombie VMs
Figure B. Find abandoned VMs with these key signs.

Admins can use VM tags to sort VMs by name, owner and validation date, which can reveal inconsistent, errantly created VMs. Admins can then shut down the VMs to ensure no one claims them before they are deleted.

Manage the virtualization infrastructure lifecycle

Organizations sometimes ask IT to set up projects that the original infrastructure was never meant to support. Virtualization enables some measure of flexibility to expand for larger deployments, but admins will eventually have to address the deprecation and replacement of IT infrastructure components.

Often, admins want to replace everything and start from scratch. This is practical for purely virtualized components, but it is rarely possible for associated hardware, such as hosts and storage frames -- all of which can incur additional costs or create system outages if absent.

In some cases, expanding the infrastructure is the cheapest option to support new projects. Most infrastructure components are inherently capable of expansion, but they all have limits. If admins don't have long-term plans, these components will eventually hit their limits and leave admins without the resources they need.

Infrastructure redesign is more complicated, but if the application needs drive new plans and admins have the space and budget needed for the implementation, it can offer significant improvements. A thorough redesign will pinpoint the hardware that needs to be replaced so admins can identify bottlenecks and pain points. To select new hardware, admins should consider servers best-suited for virtualization that also offer flexibility for future projects.

Admins can still make use of extra hardware that can't run critical applications anymore. Extra hardware can buttress an infrastructure with redundancy by backing up important applications and servers or offering space for utility VMs and management clusters. Otherwise, admins can recycle used server equipment or donate it to a technical college. However, they must be sure to make the data unrecoverable by scrubbing the drives or smashing the spindles.

Dig Deeper on Containers and virtualization

Software Quality
App Architecture
Cloud Computing
Data Center