A disaster recovery plan is something IT admins know they need, but hope they never have to use.
But if disaster does strike, there are some tips and tricks IT operations teams can rely on, using automation and IT asset management data, to ensure a DR plan works as intended.
Ensure CMDB data accuracy through automation
Data accuracy poses one of the biggest challenges during the development and implementation of a DR plan. And, because IT teams often have to enact these plans in times of stress and panic, there's a good chance they might miss or overlook potential data quality issues.
Even with an automated failover process in place, there's a risk of inaccurate data -- especially if at some point admins had to manually record critical system information, such as IP addresses, domain name system (DNS) entries and configuration changes. Nobody wants to troubleshoot a fat-fingered IP address after the main DR-enabled system fails to come up as expected.
The automated creation of critical configuration management database (CMDB) records through a well-designed and verified process reduces human errors and inaccuracies in a DR plan. In the CMDB, maintain a copy of every new IP address and DNS entry that the infrastructure uses in a failed-over state.
In addition, automate the creation of recovery groups -- groups of servers that support failover processes to meet an application's DR requirements -- so that they pull information from the CMDB. This ensures that, as long as the failover IP address is correct in the CMDB, the data in the recovery groups populates correctly.
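As a minimal sketch of that idea, the Python below builds recovery groups from CMDB records instead of hand-typed values. The record layout, the field names (`hostname`, `application`, `failover_ip`) and the sample data are assumptions made for illustration; a real CMDB export or DR product will have its own schema and API.

```python
from collections import defaultdict

# Hypothetical CMDB export: one dict per server record (illustrative data only).
CMDB_RECORDS = [
    {"hostname": "web01", "application": "storefront", "failover_ip": "10.20.0.11"},
    {"hostname": "web02", "application": "storefront", "failover_ip": "10.20.0.12"},
    {"hostname": "db01", "application": "billing", "failover_ip": "10.20.1.5"},
]


def build_recovery_groups(records):
    """Group servers by application, copying the failover IP from the CMDB."""
    groups = defaultdict(list)
    for record in records:
        groups[record["application"]].append(
            {"hostname": record["hostname"], "failover_ip": record["failover_ip"]}
        )
    return dict(groups)


if __name__ == "__main__":
    for app, members in build_recovery_groups(CMDB_RECORDS).items():
        print(app, [m["hostname"] for m in members])
```

Because the groups are derived rather than typed in, a correction made once in the CMDB carries through the next time the groups are rebuilt.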
Most modern DR products include an API for programmatic management of these processes. Tools such as PowerShell also enable administrators to automate the task of checking DR configuration data against CMDB data. Scheduled to run daily, such a check highlights errors and provides a sanity check delivered straight to an admin's inbox. Data can, and does, change over time, so these checks for quality and accuracy are critical.
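The comparison itself can be quite small. The sketch below is written in Python rather than PowerShell and assumes both data sets have already been exported as simple hostname-to-IP mappings; the function name `find_mismatches` and the input shape are hypothetical, and fetching the data from a DR product's API or emailing the result is left out.

```python
import ipaddress


def find_mismatches(cmdb, dr_config):
    """Compare two hostname -> IP mappings and list every discrepancy."""
    problems = []
    for host in sorted(set(cmdb) | set(dr_config)):
        if host not in dr_config:
            problems.append(f"{host}: in CMDB but missing from DR configuration")
            continue
        if host not in cmdb:
            problems.append(f"{host}: in DR configuration but missing from CMDB")
            continue
        try:
            # Parsing catches typos such as an octet above 255.
            cmdb_ip = ipaddress.ip_address(cmdb[host])
            dr_ip = ipaddress.ip_address(dr_config[host])
        except ValueError:
            problems.append(f"{host}: unparseable IP address")
            continue
        if cmdb_ip != dr_ip:
            problems.append(f"{host}: CMDB has {cmdb_ip}, DR configuration has {dr_ip}")
    return problems


if __name__ == "__main__":
    cmdb = {"web01": "10.20.0.11", "db01": "10.20.1.5"}
    dr_config = {"web01": "10.20.0.11", "db01": "10.20.1.50"}
    for line in find_mismatches(cmdb, dr_config):
        print(line)
```

An empty result means the two sources agree. Anything else is a list of lines that could go into the daily report.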
Admins can also use and cross-check multiple data sources to achieve a higher confidence level in DR data integrity.
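One simple way to express that confidence level is to count how many sources agree on a value. The following Python sketch assumes each source has been reduced to a hostname-to-IP mapping; the source names and the `cross_check` helper are illustrative, not part of any particular product.

```python
from collections import Counter


def cross_check(host, sources):
    """Return (most common IP, sources agreeing, sources reporting) for a host.

    `sources` maps a source name to a hostname -> IP mapping.
    """
    values = [records[host] for records in sources.values() if host in records]
    if not values:
        return None, 0, 0
    ip, votes = Counter(values).most_common(1)[0]
    return ip, votes, len(values)


if __name__ == "__main__":
    sources = {
        "cmdb": {"web01": "10.20.0.11"},
        "dr_tool": {"web01": "10.20.0.11"},
        "dns": {"web01": "10.20.0.12"},
    }
    print(cross_check("web01", sources))
```

A host on which every source agrees needs no attention, while a split result such as two out of three is a prompt for an admin to investigate before a failover depends on it.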
Potential challenges and limitations
None of these practices negates the need for thorough failover tests at scheduled intervals. A virtualized DR environment enables IT teams to test at will and in isolation without interruptions to normal operations.
While the automation of the actual failover process is important, there are limitations to what admins can do. Perform as much setup as possible ahead of any actual failure to save time and streamline operations. For example, pre-configure DNS entries in a consistent format -- such as myserver-dr.mysite.com -- to denote they are DR entries.
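A naming convention is easiest to keep consistent when a script generates and checks the names. The Python sketch below assumes the convention from the example above, a `-dr` suffix on the host label; the helper names `dr_name` and `is_dr_name` are made up for illustration.

```python
def dr_name(fqdn, suffix="-dr"):
    """Return the DR form of a DNS name by suffixing its host label."""
    host, _, domain = fqdn.partition(".")
    if host.endswith(suffix):
        return fqdn  # already a DR entry, so leave it unchanged
    return f"{host}{suffix}.{domain}" if domain else f"{host}{suffix}"


def is_dr_name(fqdn, suffix="-dr"):
    """Report whether the host label of a DNS name carries the DR suffix."""
    return fqdn.partition(".")[0].endswith(suffix)


if __name__ == "__main__":
    print(dr_name("myserver.mysite.com"))
    print(is_dr_name("myserver.mysite.com"))
```

The same helpers can be reused by a validation job to flag any pre-configured DR entry that does not follow the convention.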
In addition, be careful that automated processes don't break any configuration that needs, and expects to find, certain DR entries. That potential breakage might outweigh the usefulness of the CMDB entries, but there are ways to address this. For example, automation and APIs make it possible to update entries directly in the DNS infrastructure. Also, avoid hardcoded IP addresses in configurations, and reference DNS names instead.
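Hardcoded addresses are easy to miss by eye, so a scan of configuration files can help. This Python sketch is one possible approach, assuming plain-text configuration and IPv4 only; the function name `find_hardcoded_ips` is hypothetical, and the pattern will also match any four-part dotted number that happens to be a valid address, such as a version string like 1.2.3.4.

```python
import ipaddress
import re

# Four dot-separated groups of one to three digits; validated further below.
CANDIDATE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")


def find_hardcoded_ips(text):
    """Return (line number, address) pairs for IPv4 literals found in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in CANDIDATE.findall(line):
            try:
                ipaddress.IPv4Address(match)
            except ValueError:
                continue  # looks like an address but is not a valid one
            hits.append((lineno, match))
    return hits


if __name__ == "__main__":
    sample = "db_host = db01-dr.mysite.com\ncache_host = 10.20.3.9\n"
    print(find_hardcoded_ips(sample))
```

Each hit is a candidate for replacement with a DNS name that can be repointed during failover.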
Make a CMDB DR-ready
CMDBs can contain much more than IP addresses and DNS entries for DR purposes. Fold in additional information, such as key contacts and support groups, as well as notifications and failover reports. Most CMDB systems support the addition of fields that IT teams can choose and manipulate to meet their needs.
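To make the idea concrete, here is a sketch of what a DR-oriented record might hold, written as a Python dataclass. The class name and every field name are assumptions for illustration; an actual CMDB defines custom fields through its own administration tools.

```python
from dataclasses import dataclass, field


@dataclass
class DrCmdbRecord:
    """Hypothetical CMDB record extended with DR-related fields."""
    hostname: str
    failover_ip: str
    dr_dns_name: str
    key_contacts: list = field(default_factory=list)
    support_group: str = ""
    notification_addresses: list = field(default_factory=list)

    def missing_fields(self):
        """List the DR fields that are still empty on this record."""
        missing = []
        if not self.key_contacts:
            missing.append("key_contacts")
        if not self.support_group:
            missing.append("support_group")
        if not self.notification_addresses:
            missing.append("notification_addresses")
        return missing


if __name__ == "__main__":
    record = DrCmdbRecord("db01", "10.20.1.5", "db01-dr.mysite.com",
                          notification_addresses=["dr-team@mysite.com"])
    print(record.missing_fields())
```

A completeness check like `missing_fields` can run alongside the data accuracy checks, so gaps in contact or notification data surface before a disaster rather than during one.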
Lastly, ensure the CMDB is highly available. If admins rely on CMDB data during automated failover, and the database is unavailable, many issues can arise, including failure of the failover itself. Take all steps to ensure CMDB data, as well as systems like Active Directory and DNS, are available in all disaster scenarios.