A configuration management database has played a role in automated failover for years. More and more companies use a CMDB to map out information, such as system dependencies, for disaster recovery purposes.
If a failover process uses automation and relies on the information contained in the CMDB, IT operations teams must consider the domino effect a CMDB failure might unleash.
Each database and configuration will be different from the next, so there is no one-size-fits-all approach to CMDB high availability. Explore the approaches discussed here to optimize a CMDB and ensure it's ready to go in a time of crisis.
Take a balanced approach
Ensure physical redundancy at all levels to keep the system available. That means a load-balanced approach, with multiple front- and back-end servers to handle a host failure -- consider a classic two- or three-tier infrastructure, which ensures no single point of failure. This infrastructure can be virtualized, but the same availability rules, such as not putting VMs on the same host, apply. Losing an entire cluster could be an issue, but that is a rare event.
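The rule about not placing redundant VMs on the same host can be checked mechanically. The following is a minimal sketch, assuming placement data is available as plain dictionaries; the VM names, host names, tier labels and the `find_affinity_violations` helper are all hypothetical and not tied to any particular hypervisor or CMDB product.

```python
from collections import defaultdict


def find_affinity_violations(placements, tiers):
    """Return {(host, tier): [vms]} for hosts that carry more than one VM of a tier.

    placements maps VM name -> host name; tiers maps VM name -> tier label.
    Both structures are illustrative assumptions.
    """
    by_host_tier = defaultdict(list)
    for vm, host in placements.items():
        by_host_tier[(host, tiers[vm])].append(vm)
    # Only groups with two or more VMs break the "different hosts" rule.
    return {key: sorted(vms) for key, vms in by_host_tier.items() if len(vms) > 1}


# Hypothetical two-tier CMDB deployment: both database VMs share host-a.
placements = {
    "cmdb-web-1": "host-a",
    "cmdb-web-2": "host-b",
    "cmdb-db-1": "host-a",
    "cmdb-db-2": "host-a",
}
tiers = {
    "cmdb-web-1": "web",
    "cmdb-web-2": "web",
    "cmdb-db-1": "db",
    "cmdb-db-2": "db",
}

violations = find_affinity_violations(placements, tiers)
for (host, tier), vms in violations.items():
    print(f"{tier} tier has {len(vms)} VMs on {host}: {', '.join(vms)}")
```

In this example the web tier passes, while the database tier is flagged because a single host failure would take out both database VMs.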
Apply special care to key components of the infrastructure -- load balancers, network connectivity, etc. -- that must support the failover. IT teams should be able to initiate a DR environment from either side -- the production site or hot DR site -- of the infrastructure.
One way to ensure high availability is to maintain a redundant copy of configuration management data at the recovery site. There are many ways to implement this. It could be a read-only replicated copy, as long as it contains the information required for failover, such as unique identifiers and system dependencies. How and when teams connect to this copy is a decision for each organization.
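Whether a replicated copy actually holds the information failover needs can be verified before a crisis. The sketch below assumes the replica can be exported as a list of dictionaries with hypothetical `id`, `name` and `depends_on` fields; real CMDB schemas will differ, so treat the field names and the `check_replica` helper as placeholders.

```python
def check_replica(records):
    """Return a list of human-readable problems found in replica records.

    Checks the two items the failover depends on: every record has a unique
    identifier, and every dependency points at a record present in the copy.
    """
    problems = []
    known = {r.get("id") for r in records if r.get("id")}
    seen = set()
    for record in records:
        rid = record.get("id")
        if not rid:
            problems.append(f"record without identifier: {record.get('name', '?')}")
            continue
        if rid in seen:
            problems.append(f"duplicate identifier: {rid}")
        seen.add(rid)
        for dep in record.get("depends_on", []):
            if dep not in known:
                problems.append(f"{rid} depends on unknown item {dep}")
    return problems


# Hypothetical export of the recovery-site copy.
replica = [
    {"id": "ci-001", "name": "erp-db", "depends_on": []},
    {"id": "ci-002", "name": "erp-app", "depends_on": ["ci-001", "ci-404"]},
    {"name": "legacy-fileserver"},
]

problems = check_replica(replica)
for problem in problems:
    print(problem)
```

A check like this could run after each replication cycle, so gaps in the copy surface during normal operations instead of during a failover.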
Plan a course of action in case the CMDB fails. With the ability to fail over the CMDB without dependencies on the hot site, IT operations teams can still access the CMDB to enable failover of other items.
A good DR plan is able to fail over infrastructure without the CMDB being present -- but, in that case, automated failover becomes a manual process. For any major DR scenario, the company should have a list of priority systems and their dependencies. Act on this list first to help minimize the effects of the disaster.
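A priority list with dependencies can be turned into a concrete failover order. The sketch below is one possible approach, not a prescribed method: it uses Python's standard `graphlib` module (Python 3.9 or later) to respect dependencies and a heap to pick the most urgent system among those that are ready. The system names, priority values and the `failover_order` helper are hypothetical.

```python
import heapq
from graphlib import TopologicalSorter


def failover_order(dependencies, priority):
    """Order systems so that dependencies come up first.

    dependencies maps system -> set of systems it needs running beforehand.
    priority maps system -> number (lower means more urgent); it decides
    which system goes next among those whose dependencies are satisfied.
    Raises graphlib.CycleError if the dependencies are circular.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    ready = []  # heap of (priority, name) for systems that can start now
    order = []
    while sorter.is_active():
        for system in sorter.get_ready():
            heapq.heappush(ready, (priority.get(system, float("inf")), system))
        _, system = heapq.heappop(ready)
        order.append(system)
        sorter.done(system)  # unlocks systems that were waiting on this one
    return order


# Hypothetical priority list kept outside the CMDB, for example on paper.
dependencies = {
    "dns": set(),
    "storage": set(),
    "erp-db": {"storage"},
    "erp-app": {"erp-db", "dns"},
    "intranet": {"dns"},
}
priority = {"dns": 1, "storage": 1, "erp-db": 2, "erp-app": 3, "intranet": 9}

order = failover_order(dependencies, priority)
print(" -> ".join(order))
```

Because the list is small and static, the resulting order can be printed and stored with the DR runbook, which keeps it usable when the CMDB is unavailable.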
Ensure data and system integrity
While it is quite straightforward to create the required availability across IT infrastructure, it's far more complex to ensure the integrity of the data -- its accuracy and quality. For example, ransomware that hits a server can encrypt or corrupt the data it holds. Depending on that system's configuration -- i.e., whether its failover can occur without CMDB access -- the DR process might be impeded, and the issue exacerbated.
Pay special attention to the security of the servers involved. Follow vendor best practices and disable unused services -- because disabled services can't be exploited. The specific steps will vary by both vendor and OS. If ransomware such as CryptoLocker affects the DR infrastructure, the migration might not be possible, even with a working CMDB.
Resources and performance
Two items that deserve more attention are resource availability and performance. Automated failover tests might be fairly simple and straightforward, but they use a smaller pool of resources than a real event does, so ensure that resource availability and performance can also support large-scale failovers that use the CMDB. If not, masses of requests to the CMDB could cause a self-inflicted denial of service, which compromises the DR team's ability to perform a good recovery. Plan capacity for worst-case scenarios.
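One way to reduce the risk of a self-inflicted denial of service is to cap how many requests the failover automation sends to the CMDB at once. The sketch below illustrates the idea with a semaphore; the `BoundedCmdbClient` class, the `fake_lookup` function and the limit of two concurrent requests are assumptions for demonstration, and a real limit should come from capacity testing.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class BoundedCmdbClient:
    """Wrap a lookup function so at most max_in_flight calls run at once."""

    def __init__(self, lookup, max_in_flight):
        self._lookup = lookup
        self._slots = threading.BoundedSemaphore(max_in_flight)

    def get(self, item_id):
        # Callers beyond the limit wait here instead of piling onto the CMDB.
        with self._slots:
            return self._lookup(item_id)


# Stand-in for a real CMDB query; it records the peak number of parallel calls.
stats = {"in_flight": 0, "peak": 0}
stats_lock = threading.Lock()


def fake_lookup(item_id):
    with stats_lock:
        stats["in_flight"] += 1
        stats["peak"] = max(stats["peak"], stats["in_flight"])
    time.sleep(0.01)  # simulate query latency
    with stats_lock:
        stats["in_flight"] -= 1
    return {"id": item_id}


client = BoundedCmdbClient(fake_lookup, max_in_flight=2)

# Eight failover workers request 20 items, but only two reach the CMDB at a time.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(client.get, [f"ci-{n}" for n in range(20)]))

print(f"items fetched: {len(results)}, peak concurrent requests: {stats['peak']}")
```

Throttling on the client side trades some failover speed for predictability; it complements, and does not replace, sizing the CMDB for worst-case load.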
In short, a CMDB is useful for automated failover. However, if a natural disaster or power outage takes out the entire infrastructure, losing the CMDB information and infrastructure that was previously available complicates the DR process.