IT operations managers regularly evaluate their organization's data center and IT infrastructure. IT performance measurement requires an ops team to make cohesive plans to test the infrastructure, assess its technical performance and make recommendations for optimization. Implemented properly, a performance review's results provide a roadmap for infrastructure improvements.
There is no single approach to these reviews. Consider the goal, set the scope of measurements and then organize the tools, tactics and metrics that drive a meaningful result.
When and why to perform IT performance measurement
No formal standards dictate the approach or timing for measuring IT performance. An array of powerful monitoring and reporting tools offers real-time insight into hardware and software performance. But these insights are intended to be tactical -- moment by moment, 24 hours a day. They show how each VM, server, disk and other resource operates -- and when those resources fail.
A performance review serves another purpose, akin to an employee review. The review is not a summary of the employee's missteps on any given day but rather an objective assessment of the employee's capabilities and goals over time.
Likewise, the core of an IT performance review is a periodic assessment of how well the infrastructure works and where it can improve. The review evaluates factors such as performance, reliability, capacity, architecture and configuration, and it identifies any important limitations. Reviews offer a strategic analysis that can inform high-level business tasks, such as planning and budgeting, improvement plans and investments in new technologies.
For example, an insurance organization measures IT performance and identifies aging servers with costly maintenance that would benefit from replacement. The replacement servers offer more capacity to handle more workloads, enabling a business project to launch.
The goals and benefits of reviews are as diverse as the organizations that conduct them. Conduct reviews as often as necessary to help guide business decisions. Some organizations measure IT performance every few years as a prelude to a regular technology refresh cycle. The focus in that case is to move to more reliable and cost-effective gear or to prepare to adopt new technology. But it's reasonable to conduct reviews annually or even more frequently. In general, large organizations with fast-changing business needs benefit from more frequent checks.
A critical business event or tech advance can also trigger an evaluation of IT systems. For example, mergers and acquisitions and new business units might require additional infrastructure and applications. Issues or faults that become serious concerns, such as a chronic availability problem, can prompt a look at IT as well. And major technological advances can lead to an evaluation of existing resources to measure the effects of an upgrade.
IT performance measurement processes
The basic process to evaluate IT systems generally follows this series of five steps.
Set the goals. Focus the review on data that informs the decision you need to make, whether it is to update a tool or simply reduce IT costs.
Define the scope. While it's possible to measure the performance of every hardware and software element, such comprehensive efforts require substantially more time than a focused review -- and likely aren't worth it. The review's goals determine its scope. For example, if the goal is to assess and reduce storage costs, then limit performance measurements to storage metrics, such as disk capacity, and related resources, such as an automated data tiering tool.
Gather data. Once IT operations managers know goals and scope for a performance measurement endeavor, they must gather the data that drives this assessment. Turn to existing systems management and monitoring tools that generate a wealth of metrics and derive meaningful key performance indicators (KPIs) from systems and applications. When metrics and KPIs are not readily available, IT staff should implement tests and calculate KPIs to support the review. For example, in a review aimed to reduce support costs, admins must understand KPIs such as the number of incidents, which systems or applications are involved, the ratio of solved to escalated incidents and how long each incident took to solve.
Analyze and assess. Map the metrics and KPIs to the goals, and make objective determinations about the state of the infrastructure, systems and applications that operate within the scope of the review. Compare the current data to similar data derived from previous reviews to determine how performance has changed, if possible.
Make recommendations. Analyzing metrics, KPIs and other results from the IT performance measurement period often leads to a series of actions or project recommendations. Recommendations usually address the issues identified, or opportunities uncovered, by the assessment. For example, a significant uptick in users or help desk incidents for a business-critical application, compared to a review six months prior, demands closer investigation of the application's functionality. That investigation can lead to projects such as an application migration to a larger VM on a newer server or cloud host, additional network bandwidth provisioning to support more user traffic or an application cluster to improve availability without single points of failure.
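As a rough illustration of the "Gather data" and "Analyze and assess" steps, the sketch below derives the support KPIs mentioned above (incident count, systems involved, ratio of solved to escalated incidents and time to solve) and compares them with a previous review. The Incident record, its field names and the sample figures are all hypothetical, and the code assumes that "solved" means an incident closed without escalation. Real data would come from whatever help desk or monitoring tool the organization already uses.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Incident:
    """Hypothetical incident record; field names are assumptions."""
    system: str            # system or application involved
    hours_to_solve: float  # elapsed time until the incident was closed
    escalated: bool        # True if the incident had to be escalated


def support_kpis(incidents):
    """Summarize support KPIs for one review period."""
    solved = sum(1 for i in incidents if not i.escalated)
    escalated = len(incidents) - solved
    total_hours = sum(i.hours_to_solve for i in incidents)
    return {
        "incident_count": len(incidents),
        "by_system": dict(Counter(i.system for i in incidents)),
        # Assumes "solved" means closed without escalation
        "solved_to_escalated": solved / escalated if escalated else None,
        "mean_hours_to_solve": total_hours / len(incidents) if incidents else None,
    }


def percent_change(current, previous):
    """Relative change versus the previous review, in percent."""
    if not previous:
        return None  # no baseline to compare against
    return (current - previous) / previous * 100.0


# Made-up sample data for two review periods
previous_review = [Incident("crm", 2.0, False), Incident("crm", 4.0, True)]
current_review = [
    Incident("crm", 3.0, False),
    Incident("crm", 5.0, True),
    Incident("email", 1.0, False),
]

before = support_kpis(previous_review)
now = support_kpis(current_review)
growth = percent_change(now["incident_count"], before["incident_count"])
print(now)
print(f"Incident count change since last review: {growth:.1f}%")
```

Keeping the KPI calculation in one function makes it easier to run the same calculation on each review period, which is what allows a like-for-like comparison over time.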
Mitigate disruptions from performance reviews
Reviews are generally unobtrusive. Systems management and monitoring tools continuously generate most of the routine data used in infrastructure analysis and assessment, and this information feeds long-term trend data. The measurement on its own is not a troubleshooting or remediation exercise -- although it often spawns actions and projects for IT as a result. Expect those to be the real disruptors.
There can be additional work involved for IT operations, however. Some IT performance measurements necessitate point tools or ad hoc system checks to gather specific metrics or other details. For example, to stress test a server, the IT operations or application support team must migrate its live workloads elsewhere. Depending on the infrastructure, a hosting migration can affect application users.
Several simple strategies help minimize disruptions when system performance testing is conducted in a production environment.
Limit the scope of the review to reduce the number of systems and applications that undergo direct testing. Practice requisite skills, such as VM migration, in a lab or evaluation environment. Organize backups and other workload protection plans prior to performance tests. Finally, communicate the plans throughout the organization. Perform testing in time windows that affect the smallest group of users, and inform them when disruptions are possible.
Metrics that matter
The metrics generated in everyday IT operations -- network latency, application response times and others -- do not immediately translate to business insights. System performance reviews require a level of metrics and KPIs that connect IT behaviors with business issues. These three areas show how IT operations tie into spending, productivity and business agility.
Financial issues relate to the cost of an application's or system's operation and maintenance. For example, the organization might discover that a service costs far more to run than expected and replace it with a SaaS offering.
Delivery issues -- those that affect productivity because the systems are down or slow -- factor heavily into many organizations' IT reviews and should be measured periodically. Example KPIs include support issues per month, mean time between failures and mean time to repair.
Opportunity issues pertain to investment in ongoing operations versus in innovation or growth. Information about ongoing operations efficiency can lead to changes in IT to support agile business moves.
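Two of the delivery KPIs named under "Metrics that matter," mean time between failures (MTBF) and mean time to repair (MTTR), can be derived from outage records. The sketch below uses the common definitions: MTBF as uptime divided by the number of failures, and MTTR as total repair time divided by the number of failures. The function name, its parameters and the sample figures are hypothetical, and the code assumes each outage's duration is already known in hours.

```python
def delivery_kpis(period_hours, outage_hours):
    """Compute MTBF, MTTR and availability for one system.

    period_hours: length of the review period, in hours
    outage_hours: list with the duration of each outage, in hours
    """
    failures = len(outage_hours)
    downtime = sum(outage_hours)
    uptime = period_hours - downtime
    if failures == 0:
        # No failures in the period, so MTBF and MTTR are undefined
        return {"failures": 0, "mtbf_hours": None,
                "mttr_hours": None, "availability": 1.0}
    return {
        "failures": failures,
        "mtbf_hours": uptime / failures,    # mean time between failures
        "mttr_hours": downtime / failures,  # mean time to repair
        "availability": uptime / period_hours,
    }


# Made-up example: a 30-day period (720 hours) with three outages
kpis = delivery_kpis(720, [2.0, 4.0, 6.0])
print(f"MTBF: {kpis['mtbf_hours']:.1f} h, MTTR: {kpis['mttr_hours']:.1f} h, "
      f"availability: {kpis['availability']:.2%}")
```

Calculated the same way for each review period, these figures give a simple baseline for judging whether reliability is improving or degrading over time.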
Periodic IT performance measurement enables organizations to evaluate the current state of apps and environments, compare it with past data, recognize opportunities for meaningful improvements and chart the best path forward to benefit the business. Although often challenging and time-consuming, this process facilitates vital analysis that even the most intelligent systems management frameworks cannot provide. IT operations managers are typically called upon to plan and implement these reviews, making it a skill set worth cultivating.