Five-nines availability means a network component or service is accessible to users 99.999% of the time over a given period, usually defined as a year.
The migration from private networks to cloud services has led companies to demand that service providers offer five-nines availability. As organizations continue to add mission-critical applications and services, it is essential that these services are highly available and that downtime is kept to a minimum. When resources aren't accessible, employees, customers and supply chain partners can no longer access the information or services they need.
Availability of five-nines and other uptime percentages
Although 100% availability is the goal, it is unreasonable to expect a service will be available every minute of every day throughout the year. Maintenance, upgrades and uncontrollable events -- or acts of God -- make it impossible for a provider to guarantee 100% uptime. A five-nines availability service-level agreement (SLA) is close; it mandates that a given service will be unavailable for no more than 5 minutes and 15 seconds a year. Services covered by an SLA with four-nines availability -- or 99.99% -- could be unavailable 52 minutes and 36 seconds per year. Three-nines availability -- 99.9% -- allows 8 hours and 46 minutes of downtime per year.
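The downtime figures above follow from simple arithmetic: multiply the length of the year by the fraction of time the service is allowed to be unavailable. The following is a minimal sketch of that calculation in Python. The function name `allowed_downtime` is illustrative, and the 365.25-day year is an assumption chosen because it closely matches the quoted numbers; a real SLA defines its own measurement period and rounding.

```python
from datetime import timedelta

# Assumption: a year of 365.25 days. An SLA may define the period differently.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def allowed_downtime(availability_percent: float) -> timedelta:
    """Return the maximum yearly downtime permitted by an availability target."""
    unavailable_fraction = 1 - availability_percent / 100
    return timedelta(seconds=SECONDS_PER_YEAR * unavailable_fraction)


if __name__ == "__main__":
    # Print the yearly downtime budget for three-, four- and five-nines targets.
    for target in (99.9, 99.99, 99.999):
        print(f"{target}% availability allows {allowed_downtime(target)} of downtime per year")
```

Each additional nine cuts the yearly downtime budget by a factor of 10, which is why every step up tends to be harder to achieve than the last.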
Maintaining five-nines availability requires significant investment and upkeep. It means using established network configurations, monitoring and troubleshooting network issues, and following best practices to ensure system components remain operational. Every hour a service is unavailable can cost a company millions of dollars.
Achieving five-nines availability
How do you get more nines? Consider these steps:
- Buy the best equipment that's the easiest to repair. Then, add load balancing, failover and redundancy. Highly available systems often include redundant power supplies and processors, battery backup, diesel or natural gas generators for power outages longer than batteries can handle, multiple diverse communication lines and multiples of whatever else is likely to fail.
- Automate, where possible, to monitor network performance and flag potential malfunctions. Automation tools, network analysis software that continually tracks the health of network components, and technologies such as AI and machine learning can help operators reduce the chance of human error and keep their networks operational. AI and machine learning platforms can also proactively alert network operators to network problems or security breaches and can automatically shift operations from failing components to backups when necessary.
- Pay attention to software. Out-of-date or unpatched software can make five-nines availability impossible. If a particular component fails because of a faulty OS and takes a long time to get back online, availability will suffer.
- Test backup and disaster recovery plans to make sure they are sufficient.
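
The redundancy advice in the first step can be illustrated with standard reliability arithmetic. This is a rough sketch, not a sizing tool: it assumes components fail independently of one another, which real systems only approximate because shared power, software bugs or operator error can take down several copies at once. The function names are illustrative.

```python
def parallel_availability(component_availability: float, copies: int) -> float:
    """Availability of redundant copies where any single working copy is enough.

    Assumes independent failures: the group is down only when every copy is down.
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return 1 - (1 - component_availability) ** copies


def serial_availability(*availabilities: float) -> float:
    """Availability of a chain in which every component must be working."""
    result = 1.0
    for availability in availabilities:
        result *= availability
    return result


if __name__ == "__main__":
    single = 0.999  # one three-nines component
    print(f"One component:       {single:.6f}")
    print(f"Two in parallel:     {parallel_availability(single, 2):.6f}")
    print(f"Two in series:       {serial_availability(single, single):.6f}")
```

Under these assumptions, duplicating a component raises availability, while chaining components that must all work lowers it. That is why the steps above recommend multiples of whatever is likely to fail rather than relying on one high-quality part.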