Hyper-converged infrastructure enables new levels of workload separation, but the recovery and performance benefits of mixing test and production environments still outweigh the advantages of isolated clusters.
Virtualized workloads are standard in the modern data center. There's virtualization in core production systems, test and dev environments, and virtual desktops. Each workload typically has its own resource needs and can affect your infrastructure differently.
For example, a virtual desktop infrastructure (VDI) can strain storage and cause I/O spikes. When you couple those performance effects with the critical nature of your production environment, it's tempting to separate your test and production environments from each other and from other workloads.
Using separate clustered hardware environments enables you to stage one environment geared for heavy I/O -- your production environment -- and create another environment to support a workload with a more demanding CPU need -- your test and dev environment. This ensures that each workload has the resources and availability it needs based on its criticality.
Hyper-converged infrastructure provides the ideal platform for creating these silos for your various test and production environments. You can set up multiple hyper-converged infrastructure appliances and have one dedicated to VDI and another to your core production environment. These mini-clusters create an ideal separation for workloads, but the approach has some downsides.
Disadvantages of clustered test and production environments
Hyper-converged infrastructure enables you to tune infrastructure and software according to the workload's needs, but it comes at a cost -- and not just the price tag.
Mini-clusters of hosts dedicated to a single purpose have limited failover capacity. You should design all the clusters with an N-1 redundancy so they can handle the failure of a single host. The cluster could be 4 nodes in a hyper-converged infrastructure frame or 15 nodes in a traditional rack deployment; both can support the loss of a single host, but they have different restart times.
If you have 50 VMs per host in both the 4-node appliance and the 15-node rack, the 4-node system must restart 50 VMs across 3 hosts, or just under 17 VMs per host. In the 15-node situation, you're restarting 50 VMs across 14 hosts, or just under 4 VMs per host. That is a pretty big difference in recovery time.
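The restart math above is easy to generalize. This is a minimal sketch, not tied to any particular hypervisor, that computes how many VMs each surviving host must absorb after a failure:

```python
# Failover restart math: VMs each surviving host must restart
# when one or more hosts in the cluster fail.
def restart_load(nodes: int, vms_per_host: int, failed_hosts: int = 1) -> float:
    """Return the number of VMs per surviving host after a failure."""
    survivors = nodes - failed_hosts
    if survivors <= 0:
        raise ValueError("no surviving hosts to absorb the load")
    return (vms_per_host * failed_hosts) / survivors

print(round(restart_load(4, 50), 1))   # 4-node appliance: 16.7 VMs per surviving host
print(round(restart_load(15, 50), 1))  # 15-node rack: 3.6 VMs per surviving host
```

The same function also shows why losing a second host hurts small clusters disproportionately: `restart_load(4, 50, failed_hosts=2)` spreads 100 VMs across just 2 survivors.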
When it comes to recovery, larger clusters make a difference, but the criticality of the workload also plays a role. Clusters should be designed for N-1 redundancy, but what happens when you lose more than one host?
In a pure production environment, losing two hosts can cause a scramble for resources as the VMs come back online. If everything on that cluster is mission-critical, there are no workloads you can bump to speed the recovery of the most important VMs.
If your test and production environments share the same hosts, you can shut down the test and dev workloads to give additional resources to your production systems. This can keep core systems online and give you additional time to recover the failed hosts.
Besides the recovery reasons to mix workloads, the performance aspects can also help you optimize your hardware platforms. Mixing CPU-intensive workloads with memory-intensive ones prevents any single element of the infrastructure from being overtaxed. This enables you to take a more balanced approach when buying hardware instead of scaling up one element of the platform for a specific workload need, which can be difficult to do with hyper-converged hardware platforms.
Mix environments to get the best of both worlds
At first glance, the needs of a VDI still point toward separation being the better option despite the recovery and performance advantages of mixed test and production environments. Separating workloads ensures that a resource-hungry VDI environment doesn't cause issues with core production. But hypervisors enable you to restrict and reserve resources through resource pools to support VDI. VDI and production workloads are subject to rules that you can create and control.
A hyper-converged infrastructure platform might change how you adjust the rules in relation to its various features, but it shouldn't stop you from mixing environments. It takes additional effort to create the rules and guidelines, but once you do, the workload balance becomes hardware-independent.
As you replace infrastructure, you don't have to repeat the effort of balancing hardware platforms with each purchase. This gives you all the benefits of mixing production with test and dev environments, along with the stability and portability of the rules and restrictions that your virtual controls provide.
This logic applies to both a single hyper-converged infrastructure appliance and multiple hyper-converged clusters as long as the rules and restrictions are properly set. It might be ideal to limit I/O on a storage pool that hosts VDI to prevent issues during a boot storm, or to limit memory usage during the same boot storm. Test and dev environments might have rules that prevent excessive use of memory or CPU during peak production hours or during a failover.
By mixing the workloads on a hyper-converged infrastructure platform, you can squeeze lower-tier workloads in favor of production resources. You might, for example, reduce CPU time on a VDI environment in favor of production during a failover event. You can use these types of rules to preserve your core production at the expense of less important functions.
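The tiering logic described above can be sketched in a few lines. This is a hypothetical model, not any specific hypervisor's API: workload names, tiers, and share values are illustrative, and a real implementation would use your platform's own resource-pool controls.

```python
# Hypothetical sketch of failover priority rules: during a failover,
# test/dev workloads are suspended and VDI gives up CPU shares so
# production VMs can restart with full resources.
from dataclasses import dataclass

@dataclass
class WorkloadRule:
    name: str
    tier: int        # 1 = production, 2 = VDI, 3 = test/dev (illustrative)
    cpu_shares: int  # relative CPU entitlement

def apply_failover_policy(rules: list) -> list:
    """Suspend tier-3 workloads and halve CPU shares of tier-2 ones."""
    adjusted = []
    for rule in rules:
        if rule.tier == 3:
            continue  # shut down test/dev to free resources for recovery
        shares = rule.cpu_shares // 2 if rule.tier == 2 else rule.cpu_shares
        adjusted.append(WorkloadRule(rule.name, rule.tier, shares))
    return adjusted

pools = [
    WorkloadRule("core-prod", 1, 4000),
    WorkloadRule("vdi-pool", 2, 2000),
    WorkloadRule("test-dev", 3, 1000),
]
for rule in apply_failover_policy(pools):
    print(rule.name, rule.cpu_shares)
```

Because the policy lives in software rather than in the hardware layout, it survives hardware refreshes, which is the portability argument made above.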