Managing virtual data center growth and virtualization challenges

With careful testing, planning and management, virtual data center growth doesn't have to be an issue, even with problems that crop up with some virtualization technologies.

One of the biggest problems facing modern data centers is growth. The increasing demands of users, applications and data have strained data center resources and capital budgets. It’s easy to see why virtualization is touted as a solution to data center growth. After all, the consolidation of running more workloads on less hardware brings efficiency that would have been unimaginable just a few years ago.

But while virtualization can address some data center growth challenges, it’s not the entire answer. In fact, some attributes of virtualization technology have actually introduced new pain points that organizations sometimes struggle to rein in.

Data centers embrace virtualization to ease the proliferation and maintenance of server and network hardware and often benefit from lower power and cooling costs that accompany hardware consolidation. But virtualization deployments usually present another set of challenges that administrators must contend with.

Striking a balance
Administrators need to balance the computing demands of multiple virtual machines (VMs) hosted on fewer physical servers. Although features like live migration make it easy to move VMs across servers, it takes careful monitoring and capacity planning to avoid overtaxing a server’s available CPU cycles, memory and I/O resources.

Without the proper attention, server performance may suffer and some -- or even all -- VMs on the afflicted server might crash. To avoid that, test and evaluate the VMs to determine their resource needs and appropriate placement and data protection strategy before you deploy them to a production environment.
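As a rough illustration of that pre-deployment check, the sketch below adds up the peak demands measured for a group of VMs and compares the totals with a host's capacity while holding some headroom in reserve. The class names, the sample figures and the 20% headroom are assumptions made for this example, not values taken from any vendor tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Demand:
    """Peak demand measured for one VM during testing (hypothetical units)."""
    cpu_ghz: float
    mem_gb: float
    iops: float


@dataclass
class Host:
    """Usable capacity of one physical server."""
    cpu_ghz: float
    mem_gb: float
    iops: float


def fits(host: Host, vms: List[Demand], headroom: float = 0.2) -> bool:
    """Return True if combined VM peaks leave `headroom` spare on every resource."""
    limit = 1.0 - headroom
    return (
        sum(v.cpu_ghz for v in vms) <= host.cpu_ghz * limit
        and sum(v.mem_gb for v in vms) <= host.mem_gb * limit
        and sum(v.iops for v in vms) <= host.iops * limit
    )


# Illustrative numbers only.
host = Host(cpu_ghz=32.0, mem_gb=128.0, iops=10000.0)
candidate = [Demand(cpu_ghz=2.0, mem_gb=8.0, iops=500.0) for _ in range(10)]
print("Safe to place:", fits(host, candidate))
```

Summing peaks is deliberately conservative because VMs rarely peak at the same moment. A real capacity plan would work from monitored history instead of a single figure per VM.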

Resource problems are further complicated by the uncontrolled proliferation of VMs. This can easily occur in environments where several administrators can provision new VMs without careful consideration of the purpose and resources needed. If left unchecked, virtualization sprawl quickly saps important computing and data protection resources. It can also evolve into a management nightmare for overburdened administrators.

Virtual data centers also face storage issues including capacity and data protection challenges. VMs typically reside on a storage area network (SAN) and start up after loading into server memory, so a SAN will need enough additional storage to accommodate the VM images along with a library of snapshots, continuous data protection (CDP) journals, block-level incremental backup (BLIB), off-site replication and other backup technologies that support the data center.

Storage requirements are further exacerbated by faster recovery demands.

“Now, with virtualization, we’re getting pressure to have [disaster recovery] be close to real-time, like SAN-to-SAN replication,” said Todd Erickson, COO and senior vice president at First Flight Federal Credit Union in Cary, N.C.

Servers, storage and users are all connected by the network, and this is another area where virtualization can encounter potential problems. Bottlenecks can occur when multiple VMs contend for network access on the same physical server, prompting user complaints about application access or performance problems.

There are means to overcome those problems, including virtual I/O (network virtualization), multiple network cards, network port trunking and multipathing products. But those options are usually not considered until problems actually occur.

“The more layers [of virtualization] you add -- that’s just translation on top of translation -- the more you start seeing latency,” Erickson said. “Even with bonded gigabit [Ethernet] adapters, when you start running 20 or 30 servers on a physical piece of equipment, anytime there’s a high demand application, that’s just not enough.”

He said that the move to 10 GbE and other technologies like Cisco Systems’ Nexus may hold some future interest.
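A back-of-the-envelope calculation shows why bonded gigabit links run short. The sketch below assumes two bonded 1 GbE adapters (the article does not say how many were bonded), an even split of bandwidth among VMs and no protocol overhead, so the results are best-case averages.

```python
def per_vm_mbps(links: int, link_gbps: float, vm_count: int) -> float:
    """Even share of the aggregate uplink bandwidth per VM, in Mbps."""
    if vm_count <= 0:
        raise ValueError("vm_count must be positive")
    return links * link_gbps * 1000 / vm_count


# Hypothetical uplink: two bonded gigabit adapters shared by 20 or 30 VMs.
for vm_count in (20, 30):
    share = per_vm_mbps(links=2, link_gbps=1.0, vm_count=vm_count)
    print(f"{vm_count} VMs on 2 x 1 GbE: about {share:.0f} Mbps each")
```

Under those assumptions, 30 VMs average roughly 67 Mbps apiece, which a single high-demand application can easily exceed.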

Perhaps the underlying cause of these problems can be traced back to improper or inadequate planning. No organization should adopt virtualization for its own sake. Instead, consider goals and determine how virtualization can help meet them. Then it’s easier to gather metrics before and after and make informed decisions about the relative success of the deployment, whether a performance problem has developed, and so on.

For example, if the goal is power conservation and power cost savings, it’s simple to identify the consumption and costs before and after deploying virtualization. Clear goals and metrics also help uncover other infrastructure choices relative to those objectives. Taking the power example one step further, it may make more sense to acquire newer and far more power-efficient servers -- at least for mission-critical VMs -- rather than reusing older, less power-efficient servers.
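The before-and-after comparison comes down to simple arithmetic. The sketch below is one way to express it. The wattages, server counts and electricity rate are invented for illustration, and cooling costs are left out.

```python
HOURS_PER_YEAR = 24 * 365


def annual_power_cost(avg_watts: float, rate_per_kwh: float) -> float:
    """Yearly electricity cost for a steady average draw."""
    kwh = avg_watts * HOURS_PER_YEAR / 1000
    return kwh * rate_per_kwh


def annual_savings(before_watts: float, after_watts: float,
                   rate_per_kwh: float) -> float:
    """Difference in yearly cost between the old and new configurations."""
    return (annual_power_cost(before_watts, rate_per_kwh)
            - annual_power_cost(after_watts, rate_per_kwh))


# Hypothetical case: ten 400 W servers consolidated onto two 750 W hosts.
before = 10 * 400
after = 2 * 750
print(f"Estimated savings: ${annual_savings(before, after, 0.10):,.2f} per year")
```

Running the same function with figures for older, reused servers and for newer, more efficient ones gives a direct way to weigh the acquisition choice against the power goal.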

Virtualization issues with hardware
CPU, memory and I/O resources are still the principal limitations with server systems. There is no question that each new generation of server can support more computing resources, and more developments will undoubtedly favor virtualization. For example, processor extensions like AMD-V and Intel VT have vastly improved server performance under virtualization.

It’s important for administrators to perform due diligence with prospective acquisitions to ensure adequate performance and interoperability with management platforms and other hardware in the environment. But the value of virtualization is driving greater cooperation and collaboration among vendors.

“All of your periphery hardware will take more advantage of virtualization,” said Chris Steffen, principal technical architect at Kroll Factual Data Inc. in Loveland, Colo. “I would point to the unification of virtualization management platforms.”

Although servers are proving to be quite adept at handling virtualization, experts say that networks seem to be falling behind. Erickson expressed serious concern about network performance and about switch backplanes saturating with the traffic from multiple VMs. Technologies like trunking and multipathing, along with some architectural changes to optimize the network, can help ease bandwidth and latency problems.

Some administrators circumvent network constraints by creative load balancing -- organizing workloads so that VMs exchanging significant traffic with one another reside on the same server, leveraging the server’s internal backplane rather than moving data to and from the external network.

“The I/O never leaves the physical server,” Erickson said. “But that seems like a hokey way to manage resources with server-to-server applications because I just don’t have enough physical bandwidth between the physical servers.”
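The grouping step behind that kind of placement can be sketched as follows. This is not how any particular load-balancing product works. It simply merges VMs whose measured traffic with one another exceeds a threshold, and the VM names, traffic figures and threshold are all hypothetical.

```python
from typing import Dict, List, Tuple


def affinity_groups(traffic: Dict[Tuple[str, str], float],
                    threshold: float) -> List[List[str]]:
    """Group VMs linked, directly or indirectly, by traffic >= threshold (Mbps)."""
    parent: Dict[str, str] = {}

    def find(vm: str) -> str:
        # Union-find lookup with path halving.
        parent.setdefault(vm, vm)
        while parent[vm] != vm:
            parent[vm] = parent[parent[vm]]
            vm = parent[vm]
        return vm

    for (a, b), mbps in traffic.items():
        root_a, root_b = find(a), find(b)  # also registers both VMs
        if mbps >= threshold and root_a != root_b:
            parent[root_a] = root_b

    groups: Dict[str, List[str]] = {}
    for vm in parent:
        groups.setdefault(find(vm), []).append(vm)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical measurements of VM-to-VM traffic in Mbps.
measured = {
    ("web", "db"): 400.0,
    ("db", "cache"): 350.0,
    ("web", "mail"): 5.0,
    ("batch", "mail"): 2.0,
}
for group in affinity_groups(measured, threshold=100.0):
    print(group)
```

The sketch ignores whether a whole group actually fits on one server, so its output would still need to be checked against each host's CPU, memory and I/O limits.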

These concerns profoundly affect the way companies acquire hardware. The most noteworthy trend is away from a proliferation of small commodity servers toward larger and more powerful servers. For example, a company that needed five new servers might not have been able to afford five top-of-the-line models, but virtualization changes this. Now it makes more sense to acquire one powerful top-tier server to host 20 or 30 VMs. That one new server can typically be integrated into the existing environment with far less time and trouble than it took to integrate a larger number of commodity servers.

Experts say that equipment is often selected with virtualization-centric features, including processors with virtualization technology or multiple bonded network adapters already installed. The goal is to optimize virtualization performance in the data center or position the business to deploy virtualization effectively at a future point. The greatest threat here is to select vendor-specific features that promote unwanted lock-in that proves too costly and time-consuming to overcome later.

Virtualization and server management
A perpetual challenge with virtualization is the abstraction layer that separates a logical workload from its underlying hardware. It’s almost impossible to tell which physical server is running each virtual workload, which makes it far more difficult to intuitively optimize and troubleshoot the virtual environment.

At the same time, trouble with a physical server can impact all of the VMs running on it, and this raises the stakes for fast problem resolution and proactive prevention. As a result, virtualization has placed a new emphasis on proper server monitoring and management.

Continuous monitoring can reveal workloads that are hogging computing resources, workloads that are performing poorly and would benefit from more resources, and workloads that are underutilizing resources that can be returned to a pool and shifted to more demanding VMs. An array of configurable alarms can flag problems for an administrator and allow early intervention that will reduce helpdesk tickets and possibly avoid disastrous crashes.
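A configurable alarm of that kind can be as simple as a pair of thresholds. The sketch below classifies each VM by how much of its own allocation it uses. The 85% and 20% cutoffs, the labels and the VM names are assumptions for illustration, and real monitoring tools typically evaluate trends over time instead of single samples.

```python
from typing import Dict, List, Tuple


def classify(cpu_util: float, mem_util: float,
             high: float = 0.85, low: float = 0.20) -> str:
    """Label a VM by the busier of its CPU and memory utilization (0.0 to 1.0)."""
    busiest = max(cpu_util, mem_util)
    if busiest >= high:
        return "needs more resources"
    if busiest <= low:
        return "reclaim candidate"
    return "ok"


def alarms(samples: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return one message per VM that falls outside the normal band."""
    messages = []
    for vm, (cpu, mem) in sorted(samples.items()):
        label = classify(cpu, mem)
        if label != "ok":
            messages.append(f"{vm}: {label}")
    return messages


# Hypothetical utilization samples: (CPU, memory) as fractions of allocation.
for message in alarms({"web01": (0.95, 0.40), "test07": (0.05, 0.10),
                       "db02": (0.50, 0.60)}):
    print(message)
```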

There are many different management tools available today, but it’s important to select tools that you are comfortable with and that have the management features needed for your specific situation before making a final commitment to a virtualization platform.

“Your virtualization choice should be almost wholly dependent on which management tools you feel the most comfortable with,” said Steffen, adding that Microsoft and VMware tools are more than adequate to manage the environment, but comfort with the tools will have a major impact on the methods -- and overall success -- of your management efforts.

“If you’re not comfortable with the management solution, its options or how it integrates with your environment, then you’re choosing the wrong thing,” he said.

And the tools are getting better every day, adding more flexibility, more features and more interoperability with a broader array of products. Steffen said that Microsoft is actively producing management packs that allow System Center Virtual Machine Manager to manage more VMs in the network, along with other devices that are dependent on virtualization.

“You’re going to see some very significant progress in that area from VMware and Microsoft in the coming years -- even this year,” Steffen said. “As far as utilities and tools are concerned, I think that’s the area you’re going to see the greatest amount of change in the shortest amount of time.”

The message here is for administrators to take a fresh look at their virtual management tools and possibly evaluate upgrades or alternative products. Changing management tools can be painful at first, but deploying the best tool for your needs can result in more streamlined and effective management, especially with virtual data center growth.

Workflow and policy management with data center growth
Workflow and policies should also evolve with data center growth. For example, a business unit may have been charged 100% of server, software and maintenance costs for a traditional nonvirtualized server. Now a physical server may host 20 or 30 VMs and an organization has to alter its chargeback schema accordingly. This will change further as consolidation rates evolve and other technologies, such as clustering, experience more deployments.

It’s important to adjust policies and procedures so that they “self-limit” the way that business units request those resources. Chargeback is the way to do that.

“How do I manage so that I don’t paralyze my whole organization?” Erickson said. “I can’t just give everybody eight CPUs just because they want it.”

With data center growth and more resources becoming economically available, along with management tools improving their accuracy and insight into utilization, the chargeback model must be updated and shared with the business units.
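One possible chargeback model splits a host's monthly cost according to the resources each business unit has been allocated. The sketch below weights vCPUs and memory equally. The weighting, the cost figure and the business units are hypothetical, and an organization might just as reasonably bill on measured usage instead of allocation.

```python
from typing import Dict, Tuple


def chargeback(host_cost: float,
               allocations: Dict[str, Tuple[int, float]],
               cpu_weight: float = 0.5) -> Dict[str, float]:
    """Split host_cost by each unit's weighted share of (vCPUs, memory in GB)."""
    total_cpu = sum(vcpus for vcpus, _ in allocations.values())
    total_mem = sum(mem for _, mem in allocations.values())
    if total_cpu <= 0 or total_mem <= 0:
        raise ValueError("allocations must include some CPU and memory")
    bills = {}
    for unit, (vcpus, mem_gb) in allocations.items():
        share = (cpu_weight * vcpus / total_cpu
                 + (1.0 - cpu_weight) * mem_gb / total_mem)
        bills[unit] = host_cost * share
    return bills


# Hypothetical month: one $2,000 host shared by three business units.
bills = chargeback(2000.0, {"finance": (8, 32.0), "hr": (2, 16.0),
                            "web": (6, 16.0)})
for unit, amount in bills.items():
    print(f"{unit}: ${amount:,.2f}")
```

Because the bill rises with the allocation, a unit that asks for eight CPUs sees the cost of that request directly, which is the self-limiting effect that chargeback is meant to provide.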

But the changes don’t stop with reporting and chargeback. Each organization also needs to evolve its workflow and policies to reflect changing IT behaviors. For example, provisioning in a virtual environment can often be accomplished far faster and more easily than in a traditional nonvirtualized data center, but it’s more important for IT staff to understand the business needs of each VM and its lifecycle within the data center.

Knowing what a virtual machine is for, what resources it needs and how long it will be needed improves planning and prevents VM sprawl. But the provisioning tasks can easily be performed by less senior IT staff, giving senior staff members more time to work strategically.
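Recording those three facts at provisioning time makes sprawl easier to spot later. The sketch below shows one hypothetical record layout and a check for VMs that have outlived their planned lifetime. The field names and sample entries are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class VMRecord:
    """What a VM is for, what it needs and how long it should exist."""
    name: str
    owner: str
    purpose: str
    vcpus: int
    mem_gb: float
    expires: date


def past_expiry(records: List[VMRecord], today: date) -> List[str]:
    """Names of VMs whose planned lifetime has ended and that need review."""
    return [r.name for r in records if r.expires < today]


# Hypothetical inventory.
inventory = [
    VMRecord("test07", "qa", "release testing", 2, 4.0, date(2024, 1, 31)),
    VMRecord("web01", "marketing", "public site", 4, 16.0, date(2030, 12, 31)),
]
print(past_expiry(inventory, today=date(2024, 6, 1)))
```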

Stephen J. Bigelow, a senior technology writer in the Data Center and Virtualization Media Group at TechTarget Inc., has more than 15 years of technical writing experience in the PC/technology industry. He holds a bachelor of science in electrical engineering, along with CompTIA A+, Network+, Security+ and Server+ certifications, and has written hundreds of articles and more than 15 feature books on computer troubleshooting, including Bigelow’s PC Hardware Desk Reference and Bigelow’s PC Hardware Annoyances.
