What are some common VMware High Availability configuration errors?
What are the most common mistakes people make when configuring VMware High Availability?
First and foremost, assuming you can implement VMware High Availability (HA) without training, or at least without reading the manual, is an obvious mistake. Beyond that, some of the most common errors VMware administrators make include:
1. Not using identical host hardware in the VMware vSphere HA cluster. Mixing different host hardware can, and often does, lead to an imbalanced cluster. By default, VMware HA prepares for the worst-case scenario: the failure of the largest, most powerful host in the cluster. To handle that failure, the other hosts in the cluster have to reserve more of their resources, making those resources unavailable for running VMs.
2. Allowing cluster host inconsistencies that prevent a virtual machine (VM) from being started on any cluster host. Users often neglect to mount datastores on every cluster host. A host that cannot see a given datastore cannot boot the VMs stored on it. Another common inconsistency is an incorrectly configured Distributed Resource Scheduler (DRS) with flawed VM-to-host affinity rules.
3. Making a vSphere cluster that is too small. With vSphere HA, every host in the cluster has to reserve a portion of its resources to handle a host or node failure. A 12-node cluster would have to reserve 1/12th of the entire cluster's resources to handle the failure of a single node. If HA is required to protect against two nodes failing simultaneously, then 1/6th of the cluster's resources must be reserved. By the same arithmetic, a four-node cluster would have to reserve 1/4th of its resources to survive a single node failure. A smaller cluster therefore gives up a larger share of its capacity to tolerate node failures, while larger clusters are far more tolerant of them.
4. Not protecting the vSphere vCenter instance. This is one of those Homer Simpson "D'oh!" moments. It seems obvious, and yet it's a common mistake.
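5. Not enabling the PortFast option on the network switch ports. Without PortFast, a switch port running spanning tree waits before it starts forwarding traffic, so the restarted VMs take a long time to regain network connectivity. This can give users the impression that the failed node's VMs have not come back up.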
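The reservation arithmetic behind mistakes 1 and 3 can be sketched in a few lines of Python. This is a simplified model, not VMware's actual admission control algorithm: it assumes capacity is a single number per host and that the cluster reserves enough to absorb the loss of its largest hosts. The function name `reserved_fraction` and the capacity figures are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def reserved_fraction(host_capacities, failures_to_tolerate=1):
    """Share of total cluster capacity held back so the cluster can absorb
    the loss of its largest `failures_to_tolerate` hosts.

    Simplified model (an assumption, not VMware's real algorithm):
    one capacity number per host, worst case = largest hosts fail.
    """
    if not 0 < failures_to_tolerate < len(host_capacities):
        raise ValueError("failures_to_tolerate must be at least 1 and "
                         "less than the number of hosts")
    # Worst case: the biggest hosts are the ones that fail.
    largest = sorted(host_capacities, reverse=True)[:failures_to_tolerate]
    return Fraction(sum(largest), sum(host_capacities))


# Capacity units are arbitrary (for example, GB of RAM per host).
print(reserved_fraction([256] * 12))             # 1/12
print(reserved_fraction([256] * 12, 2))          # 1/6
print(reserved_fraction([256] * 4))              # 1/4
print(reserved_fraction([512, 128, 128, 128]))   # 4/7
```

The last call shows the imbalance described in mistake 1: in this model, one oversized host in a four-node cluster pushes the reserved share from 1/4 to 4/7 of total capacity.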