First and foremost, assuming you can implement VMware High Availability (HA) without training, or at least without reading the manual, is an obvious mistake. Beyond that, some of the most common errors VMware administrators make include:
1. Not using identical host hardware in the VMware vSphere HA cluster. Mixing different host hardware can, and often does, lead to an imbalanced cluster. By default, VMware HA prepares for the worst-case scenario: the failure of the largest, most powerful host in the cluster. To cover that failure, more resources have to be reserved on the other hosts in the cluster, which makes those resources unavailable for running VMs.
2. Allowing cluster host inconsistencies that prevent a virtual machine (VM) from being started on any cluster host. Users often neglect to mount datastores on every cluster host. A host that cannot see a given datastore cannot boot the VMs stored on it. Another common inconsistency is an incorrectly configured Distributed Resource Scheduler (DRS) with flawed VM-to-host affinity rules.
3. Making a vSphere cluster that is too small. With vSphere HA, every host in the cluster has to reserve a portion of its resources to handle a host (node) failure. A 12-node cluster would have to reserve 1/12th of the entire cluster's resources to handle the failure of a single node. If HA is required to protect against two nodes failing simultaneously, then 1/6th of the cluster's resources must be reserved. A smaller cluster has to set aside a larger share of its resources for each failure it tolerates, which hampers its ability to absorb node failures. Larger clusters are a lot more tolerant of failures.
4. Not protecting the vSphere vCenter instance. This is one of those Homer Simpson "D'oh!" moments. It seems obvious, and yet it's a common mistake.
5. Not enabling the PortFast option on the network switch ports. Without it, the VMs from the failed node can take so long to regain network connectivity after restarting that users get the impression they have not come up at all.
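
The reservation arithmetic behind mistakes 1 and 3 can be sketched in a few lines of Python. This is a simplified illustration, not VMware's actual admission control algorithm: it assumes each host's capacity can be expressed as a single number and that the cluster reserves enough to cover the loss of its largest hosts. The function name `reserved_fraction` and the capacity figures are hypothetical.

```python
from fractions import Fraction


def reserved_fraction(host_capacities, failures_to_tolerate=1):
    """Share of total cluster capacity held back to survive the loss of
    the largest `failures_to_tolerate` hosts (simplified worst-case model,
    not VMware's real admission control calculation)."""
    if not 0 < failures_to_tolerate < len(host_capacities):
        raise ValueError("failures_to_tolerate must be between 1 and hosts - 1")
    # Worst case: the biggest hosts are the ones that fail.
    largest = sorted(host_capacities, reverse=True)[:failures_to_tolerate]
    return Fraction(sum(largest), sum(host_capacities))


# Hypothetical capacity units; only the ratios matter here.
twelve_identical = [100] * 12
three_identical = [100] * 3
one_large_host = [200] + [100] * 11  # one host twice the size of the rest

print(reserved_fraction(twelve_identical, 1))  # 1/12
print(reserved_fraction(twelve_identical, 2))  # 1/6
print(reserved_fraction(three_identical, 1))   # 1/3
print(reserved_fraction(one_large_host, 1))    # 2/13
```

Under these assumptions, a three-node cluster gives up a third of its capacity to tolerate one failure, and a single oversized host pushes a 12-node cluster's reservation from 1/12 to 2/13 of the total.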