Hunt zombie VMs and control virtualization sprawl

VM tagging is an effective tactic for discovering and eliminating zombie VMs. Cut useless VMs, redundant data and confusing configurations to enable efficient resource consumption and limit VM sprawl.

Data centers generally aren't reminiscent of post-apocalyptic wastelands, but the two have at least one thing in common: zombies.

Zombie VMs are abandoned virtual machines that can devour a data center's resources. However, with VM tags and usage monitoring, IT staff can hunt zombies and control virtualization sprawl.

Zombie VMs, zombie data and zombie infrastructure components all consume resources or cause confusion without offering anything in return.

Despite their relative abstractness, VMs and virtual resources aren't free. Software-defined technology still requires resources, even for abandoned data. IT budgets are tight and businesses can't afford waste.

Eliminate current zombie VMs by studying usage, and then establish a VM tagging policy to ensure annual inspections can keep the data center free from virtualization sprawl.

Hunt zombie VMs with VM tags

Zombie VMs tend to be a result of human error and forgetfulness. In a modern deployment, automation rapidly creates VMs in response to changing IT and business needs, and it's easy to lose track of them. The challenge is to find abandoned VMs, especially when hunting in a large infrastructure.

How to find zombie VMs

Implementing VM tags upfront can make finding zombie VMs easier later on. IT administrators can use VM tags to attach notes to each VM when it is created, such as its owner and purpose. Admins can then search on these tags and quickly find possible zombies, including VMs that have no tags at all.
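As a rough illustration of searching by tag, the sketch below checks an inventory against a tagging policy and lists the VMs that fail it. The `VM` class, the `owner` and `purpose` tag keys and the sample inventory are all hypothetical; a real environment would pull this data from its own management tooling.

```python
from dataclasses import dataclass, field

# Hypothetical tag keys; a real tagging policy would define its own.
REQUIRED_TAGS = ("owner", "purpose")


@dataclass
class VM:
    name: str
    tags: dict = field(default_factory=dict)


def missing_tags(vm, required=REQUIRED_TAGS):
    """Return the required tag keys that are absent or blank on this VM."""
    return [key for key in required if not vm.tags.get(key, "").strip()]


def possible_zombies(inventory):
    """Map VM name -> missing tag keys for every VM that fails the policy."""
    suspects = {}
    for vm in inventory:
        gaps = missing_tags(vm)
        if gaps:
            suspects[vm.name] = gaps
    return suspects


# Made-up inventory for illustration only.
inventory = [
    VM("web-01", {"owner": "web-team", "purpose": "storefront"}),
    VM("test-17", {"owner": "", "purpose": "load test"}),
    VM("tmp-build", {}),
]

for name, gaps in possible_zombies(inventory).items():
    print(f"{name}: missing {', '.join(gaps)}")
```

A VM that fails the check is only a candidate for review; a missing tag shows that nobody documented the VM, not that nobody uses it.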

This method is ideal, but it requires work ahead of time. If the zombies are already inside the data center, IT staff will need to look for other evidence to find and eliminate them.

Unusual usage might signal an infestation

Usage provides other signs of zombie VMs and virtualization sprawl. CPU, memory and network usage can all signal the presence of zombie VMs. If IT staff find VMs that consume almost no resources, they should mark them for further investigation.
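A usage survey like this can be sketched in a few lines. The example below assumes usage samples have already been exported as dictionaries with `cpu_pct` and `net_kbps` fields; those field names, the thresholds and the sample data are assumptions for illustration, and a memory metric could be added in the same way.

```python
from statistics import mean

# Assumed thresholds; tune them against a baseline for the environment.
CPU_IDLE_PCT = 2.0
NET_IDLE_KBPS = 5.0


def is_idle(samples):
    """True when average CPU and network usage both stay under the thresholds."""
    avg_cpu = mean(s["cpu_pct"] for s in samples)
    avg_net = mean(s["net_kbps"] for s in samples)
    return avg_cpu < CPU_IDLE_PCT and avg_net < NET_IDLE_KBPS


def flag_for_review(usage_by_vm):
    """Return the names of VMs whose samples look idle, sorted for reporting."""
    return sorted(
        name for name, samples in usage_by_vm.items()
        if samples and is_idle(samples)
    )


# Made-up samples for illustration only.
usage_by_vm = {
    "db-02": [{"cpu_pct": 35.0, "net_kbps": 820.0}, {"cpu_pct": 41.5, "net_kbps": 910.0}],
    "test-17": [{"cpu_pct": 0.4, "net_kbps": 0.0}, {"cpu_pct": 0.6, "net_kbps": 1.2}],
    "dns-backup": [{"cpu_pct": 0.9, "net_kbps": 2.0}, {"cpu_pct": 1.1, "net_kbps": 3.0}],
}

print(flag_for_review(usage_by_vm))
```

The result is a list of VMs to investigate, not a list to delete. In this made-up data, a standby server such as `dns-backup` is flagged alongside a genuine leftover.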

Don't delete or disconnect the VMs right away; not all workloads are in use every day. Backup Active Directory domain controllers, Domain Name System (DNS) servers and licensing servers are all common critical servers that can appear to be doing nothing. Make a list of the servers to examine after an initial survey.

Look for a few key tells. A range of server names could indicate a deployed group that never sees use. Unpatched services could indicate abandonment. Look at server logs to see when someone last logged in or accessed the server remotely.
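Spotting a range of server names is easy to automate. The sketch below groups names that share a prefix and end in a number; the naming pattern, the minimum group size and the sample names are assumptions, and real naming conventions may need a different pattern.

```python
import re
from collections import defaultdict

# Assumed convention: a shared prefix followed by a trailing number.
SERIES = re.compile(r"^(?P<prefix>.*?)(?P<number>\d+)$")


def name_series(names, min_size=3):
    """Group numbered names by prefix; keep groups with at least min_size members."""
    groups = defaultdict(list)
    for name in names:
        match = SERIES.match(name)
        if match:
            groups[match["prefix"]].append(int(match["number"]))
    return {
        prefix: sorted(numbers)
        for prefix, numbers in groups.items()
        if len(numbers) >= min_size
    }


# Made-up names for illustration only.
names = ["loadtest-01", "loadtest-02", "loadtest-03", "loadtest-04", "dc-01", "sql-prod"]

print(name_series(names))
```

A large numbered group is only a hint that the VMs were deployed together; usage data and login records still decide whether the group is abandoned.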

If all signs indicate an abandoned server, ask other users who might know about it. If no one responds, test it. Start small: disconnect it from the network for days or even weeks to see if anyone complains. If no one notices its absence, shut it down, archive it to disk and wait.

Combatting virtualization sprawl is a time-consuming process that requires patience between steps. Don't risk the destruction of any critical data.

Undead data and infrastructure

Zombie data is dead, purposeless data that still takes up space in the infrastructure. Space that might seem plentiful on dedicated physical infrastructure is shared in a virtual environment, where overconsumption can lead to scarcity.

Duplicate data is a primary culprit, but patches, ISO files and updates can all live on as dead data after application to virtual servers. For example, a database administrator might have downloaded a 2 GB patch file on the database server and left it on the desktop or in the recycle bin after using it.
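Duplicate files can be found by comparing content hashes instead of file names. The sketch below is one minimal way to do that with the Python standard library; it builds a throwaway directory for the demonstration, and the file names in it are made up.

```python
import hashlib
import tempfile
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large files don't fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Group files under root by content; return only groups of two or more."""
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]


# Demo on a throwaway directory so the sketch is safe to run anywhere.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "Desktop").mkdir()
    (root / "Downloads").mkdir()
    (root / "Desktop" / "patch.bin").write_bytes(b"patch payload")
    (root / "Downloads" / "patch-copy.bin").write_bytes(b"patch payload")
    (root / "Downloads" / "notes.txt").write_bytes(b"unrelated")
    duplicate_groups = [
        [p.relative_to(root).as_posix() for p in group]
        for group in find_duplicates(root)
    ]

print(duplicate_groups)
```

The function only reports groups of identical files. Deciding which copy to keep, if any, remains a manual call.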

This dead data can grow and have a huge effect on what occurs downstream with snapshots, backups and disaster recovery plans. Clearing dead data from VMs can be a tough, manual process, but it's worth the effort.

Zombie infrastructure can cause inefficiencies and confusion. Virtual networks, port groups and virtual LANs are easy to miss because they're simple to create and don't hog resources. Remove old network configurations to eliminate complexity.

vCenter reporting tools can verify network activity, so IT staff can check whether resources are still in use before removing them. IT staff can also rename a network before removing it to confirm that nothing still tries to connect to it, but don't let this extra step delay proper housekeeping.

Zombies in different forms infect data centers and either take up resources or cause confusion. IT staff must deal with them to correct existing problems, make virtual infrastructure efficient and limit virtualization sprawl. After taking out zombies, IT staff can use VM tags to indicate the date of a VM's last inspection.

Plan an annual VM inspection after establishing a baseline. No staff can entirely eliminate human error, so prevention and periodic examination are the best strategy to keep zombies at bay.
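An inspection-date tag makes the annual review easy to schedule. The sketch below assumes a hypothetical `last-inspected` tag holding an ISO date and lists the VMs that are overdue; the tag key, the 365-day interval and the sample data are illustrative only.

```python
from datetime import date, timedelta

INSPECTION_TAG = "last-inspected"           # hypothetical tag key
INSPECTION_INTERVAL = timedelta(days=365)   # assumed annual cycle


def overdue(vm_tags, today):
    """Return names of VMs never inspected or last inspected over a year ago."""
    late = []
    for name, tags in vm_tags.items():
        stamp = tags.get(INSPECTION_TAG)
        if stamp is None or today - date.fromisoformat(stamp) > INSPECTION_INTERVAL:
            late.append(name)
    return sorted(late)


# Made-up tag data for illustration only.
vm_tags = {
    "web-01": {"last-inspected": "2024-03-01"},
    "db-02": {"last-inspected": "2022-11-15"},
    "tmp-build": {},
}

print(overdue(vm_tags, today=date(2024, 6, 1)))
```

Treating a missing tag as overdue means newly discovered or untagged VMs are pulled into the next inspection automatically.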
