preventive maintenance

Preventive maintenance (also seen as preventative maintenance) is the practice of routinely taking measures in hardware administration that reduce the risk of failures and improve the likelihood of quick recovery if a failure does occur.

Server failures, for example, are usually the result of failed hard disk drives, power supplies, RAID adapters, motherboards, RAM or CPUs. Preventive maintenance starts with the purchase of good-quality equipment in the first place. It’s advisable, for example, to buy a dedicated server rather than to use a personal computer as a server. Dedicated servers often have built-in features that make the systems more robust, such as dual power supplies, hot-swap capability for PCI slots and fault-tolerant RAM. Similarly, it’s important to use disk drives that are engineered to withstand enterprise-level demands, along with high-end surge protectors or uninterruptible power supplies (UPSes).

In terms of ongoing maintenance, it’s important to clean servers and air inlet filters regularly to ensure that metallic particles don’t get into the system. Covering any openings left when boards are removed can keep out larger intruders, such as insects and mice. Proper ventilation is crucial, and a server room should have adequate cooling to maintain an ambient temperature of no more than 70 degrees Fahrenheit (about 21 degrees Celsius).
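
As a rough illustration of the temperature guideline, the following Python sketch flags sensor readings that exceed the 70-degree ceiling. It assumes that the figure is in Fahrenheit and that readings arrive in Celsius from some monitoring source; the sensor names, the sample values and the function names are hypothetical and do not refer to any particular monitoring product.

```python
# Ceiling taken from the guideline above; assumed to be degrees Fahrenheit.
MAX_AMBIENT_F = 70.0


def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert a Celsius reading to Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0


def readings_over_limit(readings_c: dict[str, float],
                        limit_f: float = MAX_AMBIENT_F) -> dict[str, float]:
    """Return {sensor: temperature in F} for every reading above the limit."""
    over = {}
    for sensor, celsius in readings_c.items():
        fahrenheit = celsius_to_fahrenheit(celsius)
        if fahrenheit > limit_f:
            over[sensor] = round(fahrenheit, 1)
    return over


if __name__ == "__main__":
    # Hypothetical sample readings in Celsius, keyed by made-up sensor names.
    sample = {"rack-a-inlet": 20.0, "rack-b-inlet": 24.5}
    for sensor, temp_f in readings_over_limit(sample).items():
        print(f"WARNING: {sensor} at {temp_f} F exceeds {MAX_AMBIENT_F} F")
```

In practice, a check like this would run on a schedule and feed whatever alerting mechanism the site already uses, so that a cooling problem is noticed before it causes a hardware failure.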

To improve the likelihood of quick recovery when failures do occur, the most effective measure is redundancy: maintaining a replacement server or spare server components that can be swapped in during an emergency.

This was last updated in October 2016
