Taming data center disturbances with vRealize Operations Manager

As more workloads and servers are virtualized, the vSphere administrator will need some kind of tool to help keep an eye on resources and make intelligent choices.

While virtualization has delivered flexibility and efficiency to the data center, it has also intensified the need for careful monitoring, management and automation. VMware shops can select from a wide range of management platforms, but VMware's vRealize Operations Manager -- formerly vCenter Operations Manager -- supports mixed environments that rely on vSphere, Hyper-V and Amazon Web Services.

Automation and workflow management tools such as vRealize Operations Manager (vROps) provide a powerful array of features, but they can be challenging to set up and use properly. One of these features is the alert thresholds function. There are times when the thresholds should be static, others when they should be dynamic -- and other times when both should be implemented.

A threshold is the point at which a behavior or metric becomes abnormal. Monitoring tools such as vROps can set and track many different thresholds across the environment. The metrics behind them include server fan speeds and temperatures, processor and memory demands, performance details (such as CPU usage or wait time), server cluster health and configuration details, and many other physical and virtual elements. VMware's vROps uses thresholds to create alert definitions that can generate warning messages and take other user-defined actions in response.

Static thresholds are set as a single fixed point that stays in place until an administrator changes it. VMware's vROps also supports dynamic thresholds, which are derived from both historical and current performance data. Dynamic thresholds allow vROps to configure alerting and activity thresholds based on what the physical and virtual environment is actually doing over time.
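The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration, not how vROps calculates its thresholds: the 90% limit, the sample CPU readings, the function names and the "mean plus or minus three standard deviations" rule are all assumptions chosen for the example.

```python
from statistics import mean, stdev

# Hypothetical fixed limit (percent CPU); it only changes when someone edits it.
STATIC_CPU_LIMIT = 90.0


def static_breach(value, limit=STATIC_CPU_LIMIT):
    """Static threshold: compare the reading against one fixed point."""
    return value > limit


def dynamic_bounds(history, k=3.0):
    """Dynamic threshold (assumed rule): mean +/- k sample standard deviations."""
    mu = mean(history)
    sigma = stdev(history)
    return mu - k * sigma, mu + k * sigma


def dynamic_breach(value, history, k=3.0):
    """Flag a reading that falls outside the range learned from history."""
    low, high = dynamic_bounds(history, k)
    return value < low or value > high


# Made-up CPU usage history for a server that normally idles around 20%.
history = [20, 22, 19, 21, 23, 20, 18, 22, 21, 20]

print(static_breach(60))            # False: 60% is below the fixed 90% limit
print(dynamic_breach(60, history))  # True: 60% is far outside this server's norm
```

In this sketch, a jump to 60% CPU never trips the static alert, yet it is clearly unusual for a server that normally sits near 20%. That gap is the case for deriving thresholds from observed behavior.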

Dynamic thresholds are usually recalculated on a regular schedule, but administrators can update them on demand in vROps by selecting Administration > Support > Dynamic Thresholds > Start. This runs an update and processes the most recent metric data. Administrators can start and stop the calculation process, watch the completion percentage as it progresses or review the timestamps and metric counts used to formulate the dynamic thresholds.

Conventional wisdom suggests the use of dynamic thresholds wherever possible. The justification is a matter of complexity; a modern enterprise may monitor thousands -- maybe hundreds of thousands -- of metrics. It's not practical for an administrator to know the "normal" point of every threshold. While it's certainly possible to set thresholds manually, the result is often sub-optimal and is almost impossible to maintain, especially as business needs and the data center change over time.

Tools like vROps scan the entire environment, watch performance and store historical performance data in a database. When using dynamic thresholds, the tool processes the historical data and looks for patterns, then sets thresholds that mark clear deviations from normal behavior. As those patterns change, the thresholds can adjust automatically without administrative intervention. It's an application of "machine learning," which is finding adoption in cloud computing services.
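To make the idea of pattern-based thresholds concrete, here is a minimal Python sketch that learns a separate normal range for each hour of the day. It is an assumption-laden illustration rather than the vROps algorithm: the hourly grouping, the function names, the `min_spread` floor and the sample readings are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def learn_hourly_bounds(samples, k=3.0, min_spread=1.0):
    """Learn a (low, high) range per hour from (hour, value) history.

    min_spread is a hypothetical floor so that perfectly flat history
    does not produce a zero-width range.
    """
    by_hour = defaultdict(list)
    for hour, value in samples:
        by_hour[hour].append(value)

    bounds = {}
    for hour, values in by_hour.items():
        mu = mean(values)
        spread = max(pstdev(values), min_spread)
        bounds[hour] = (mu - k * spread, mu + k * spread)
    return bounds


def is_abnormal(hour, value, bounds):
    """A reading is abnormal if it falls outside the range learned for its hour."""
    if hour not in bounds:
        return False  # no history for this hour yet, so nothing to compare against
    low, high = bounds[hour]
    return value < low or value > high


# Made-up history: a nightly backup drives CPU to ~80% at 02:00,
# while the same server idles near 20% at 14:00.
samples = [(2, v) for v in (78, 82, 80, 79, 81)]
samples += [(14, v) for v in (18, 22, 20, 19, 21)]

bounds = learn_hourly_bounds(samples)
print(is_abnormal(2, 80, bounds))   # False: 80% is routine during the backup window
print(is_abnormal(14, 80, bounds))  # True: the same reading is unusual mid-afternoon
```

Rerunning `learn_hourly_bounds` on fresh history is what lets the ranges follow the workload as it changes, with no one editing a limit by hand.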

 
