Organizations rely on their data centers to support most of their business, which means that even a single hour of outage can do significant damage. Whether your organization uses a dedicated data center or a colocation facility, ensure you have the right people, processes and strategy to maintain uptime.
To build and maintain a truly effective and scalable data center, you must have the right people to monitor and manage it. A data center with poorly supported facilities can quickly undermine the best-laid plans, putting your people, infrastructure and business all at risk. If you have the opportunity to build your own data center facility team, ensure the people you select have the right mix of expertise, skill and training to keep your infrastructure and facilities running smoothly.
Why people matter in data center facility management
Data centers' inherent complexity makes them likely to suffer some sort of outage caused by human error -- 70% to 75% more likely than other business facilities, according to some studies. Minimizing human error and ensuring adequate staffing levels require hiring and investing in skilled, team-oriented people to install, maintain and operate your data center.
Data centers can be challenging environments to manage because of the types of infrastructure they use, such as VMs, virtual storage and virtual networks; the workloads they handle daily; and the IT maintenance cycles they undergo. These challenges require careful coordination and planning with multiple teams, including the facility team, infrastructure owners, business owners, users and upper management.
With so many moving parts and invested stakeholders, you must ensure good communication and active involvement between all parties. Having the right facilities team and strategy can make all the difference, because ultimately the facility affects the rest of the company. You can take four concrete steps to create an effective, scalable and valuable data center facility team for your organization.
Step 1: Document your strategy
Managing and operating a mission-critical facility requires the facility's entire team to thoroughly understand expectations. The first step to getting the facility team on the same page is to have a documented data center management strategy. This should outline your facility's infrastructure requirements and the availability of services in your facility, such as environmental health and safety, energy management, emergency preparedness and ongoing training.
To ensure the strategy is as effective as possible, consider adding these processes to it as well:
- Regular facility inspections ensure everything remains in proper working order. Check the generators, water temperature, fuel levels and electrical and mechanical distribution systems.
- Continuous systems testing keeps crucial systems operating within stated and safe parameters. This includes regular load testing and backup or failover testing.
- Predictive maintenance activities can identify any changes, trends or irregularities in operations that could precede potential failure. Predictive maintenance gives facilities staff a chance to address issues before they become a problem.
- Preventive maintenance processes and procedures keep mission-critical systems running smoothly and reduce expensive ad hoc fixes. Follow manufacturer guidelines regarding when you should perform these activities to keep systems healthy.
- Corrective maintenance activities, such as repairing or troubleshooting system or component failures, address items that fail outside of their regular maintenance schedule. This includes facility upkeep such as fixing a leak or replacing a faulty HVAC part.
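As a rough illustration of the predictive maintenance idea above, the following Python sketch flags sensor readings that deviate sharply from their recent history. It is a minimal example under stated assumptions, not a production monitoring method: the function name, the five-reading window, the threshold of three standard deviations and the sample temperature values are all hypothetical.

```python
from statistics import mean, stdev


def flag_irregular_readings(readings, window=5, threshold=3.0):
    """Return the indexes of readings that deviate sharply from the
    preceding `window` readings.

    `window` and `threshold` are illustrative defaults, not recommended values.
    """
    flagged = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        baseline = mean(history)
        spread = stdev(history)
        if spread == 0:
            # Perfectly flat history: treat any change at all as irregular.
            if readings[i] != baseline:
                flagged.append(i)
            continue
        # Flag the reading if it sits more than `threshold` standard
        # deviations away from the recent average.
        if abs(readings[i] - baseline) / spread > threshold:
            flagged.append(i)
    return flagged


# Hypothetical chilled-water temperature samples with one spike at index 6.
samples = [70.0, 70.5, 69.8, 70.2, 70.1, 70.3, 78.0, 70.0]
print(flag_irregular_readings(samples))  # prints [6]
```

A check like this only highlights readings worth a closer look; facilities staff would still decide whether a flagged value points to a developing fault.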
Step 2: Create a well-rounded team
Your data center team should include experts in facility-specific domains such as electrical, mechanical and operational controls; fire detection and suppression; quality management; building management systems; and personnel training. If this team does double duty as your infrastructure or IT support team, its members must also understand digital maintenance software systems such as DCIM products.
Your facility team must remain up to date on industry trends and changes that can affect their work, so make ongoing training a long-term priority.
Step 3: Develop a unique staffing model
Your company should develop a staffing model specific to its own facilities, business functions and operational mandates. Consider coverage requirements for support, such as whether you have daytime business hours only or operate 24/7; emergency response needs; maintenance activity workload; project supervision requirements; and your operations budget.
Regular analysis of your facility maintenance scope determines staffing requirements for those activities, which can help you better allocate and forecast resources and budget. Ultimately, your systems' mission criticality and the cost of downtime drive your coverage.
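To make the link between coverage requirements and staffing more concrete, here is a simple Python sketch that estimates headcount from weekly coverage hours and maintenance workload. It is a back-of-the-envelope illustration only: the function name, the 40-hour work week, the 85% availability factor (to allow for leave and training) and the example inputs are all assumptions, and the sketch treats maintenance hours as additional to shift coverage.

```python
import math


def estimate_headcount(coverage_hours_per_week, staff_per_shift,
                       maintenance_hours_per_week,
                       hours_per_person_per_week=40.0, availability=0.85):
    """Estimate how many people are needed to cover a facility.

    All defaults are hypothetical; substitute figures from your own
    maintenance scope analysis and staffing policies.
    """
    # Total person-hours needed each week: shift coverage plus maintenance.
    required = (coverage_hours_per_week * staff_per_shift
                + maintenance_hours_per_week)
    # Hours one person can realistically contribute each week.
    productive = hours_per_person_per_week * availability
    # Round up, because a fractional person cannot be hired.
    return math.ceil(required / productive)


# 24/7 coverage (168 hours) with two people per shift and 40 maintenance hours.
print(estimate_headcount(168, 2, 40))  # prints 12
# Business-hours coverage (50 hours) with one person and 20 maintenance hours.
print(estimate_headcount(50, 1, 20))   # prints 3
```

Comparing the two results shows how quickly a move from business-hours support to 24/7 coverage raises headcount, which is why mission criticality and the cost of downtime weigh so heavily in the decision.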
Step 4: Define and document roles and responsibilities
Many companies include role and responsibility documentation in their data center personnel management. Consider the wide variety of roles and the many people involved in running a data center facility, from the facility managers and data center admins to the stakeholders and those affected by changes in the data center.
Clearly define each individual's roles and responsibilities on the facilities team and the other teams they're involved in. Well-defined descriptions provide the benchmarks needed to evaluate skills and performance and set goals for growth and training. They also help your organization avoid collaboration issues where information gets isolated within certain groups, or people work outside of their domains.