Cloud workload management doesn't stop after you integrate a new service or app. Admins must ensure an AWS deployment remains available and runs within acceptable performance parameters. Amazon CloudWatch collects and processes data from resources, then delivers AWS performance metrics in near real time to help admins do just that.
It's important to spot AWS performance problems as they occur and fix them as quickly as possible. CloudWatch metrics can alert cloud administrators to service disruptions so they can troubleshoot rapidly. Admins should also record and evaluate these metrics over time to build a long-term historical view of how each service instance performs. That view can help them optimize configurations, add or change services, and make other strategic decisions that improve workload performance.
Although AWS offers a high degree of automation for monitoring and alerting, it's important to build a monitoring and response plan that fits your organization. Decide what information you need and what to do with it when you get it. Items to define include:
- the purpose of monitoring;
- which resources to monitor;
- which metrics matter most;
- the frequency of monitoring each metric;
- the range of acceptable parameters for alerts and events; and
- which individuals will handle monitoring, notification and response.
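
One way to make the items above concrete is to write the plan down as data before configuring anything in AWS. The following is a minimal sketch in plain Python, not a CloudWatch API: the `MetricRule` and `MonitoringPlan` classes, their field names, the resource name `web-server-1`, the thresholds and the `ops-on-call` owner are all hypothetical and chosen only for illustration. `CPUUtilization` and `StatusCheckFailed` are used as example metric names for an EC2 instance.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class MetricRule:
    """One monitored metric. All field names are illustrative assumptions."""
    resource: str        # which resource to monitor
    metric: str          # which metric on that resource
    priority: int        # relative importance (1 = most important)
    period_seconds: int  # how often the metric is evaluated
    low: float           # lower bound of the acceptable range
    high: float          # upper bound of the acceptable range
    owner: str           # who handles notification and response

    def in_range(self, value: float) -> bool:
        # Bounds are inclusive: a value equal to low or high is acceptable.
        return self.low <= value <= self.high


@dataclass
class MonitoringPlan:
    purpose: str             # why this monitoring exists
    rules: List[MetricRule]

    def violations(self, samples: Dict[Tuple[str, str], float]) -> List[MetricRule]:
        """Return rules whose sample is out of range, most important first."""
        broken = [
            rule for rule in self.rules
            if (rule.resource, rule.metric) in samples
            and not rule.in_range(samples[(rule.resource, rule.metric)])
        ]
        return sorted(broken, key=lambda rule: rule.priority)


# Hypothetical plan for a single web server.
plan = MonitoringPlan(
    purpose="Keep the web tier available and responsive",
    rules=[
        MetricRule("web-server-1", "CPUUtilization", 2, 300, 0.0, 80.0, "ops-on-call"),
        MetricRule("web-server-1", "StatusCheckFailed", 1, 60, 0.0, 0.0, "ops-on-call"),
    ],
)

# Made-up sample values keyed by (resource, metric).
latest = {
    ("web-server-1", "CPUUtilization"): 93.5,
    ("web-server-1", "StatusCheckFailed"): 0.0,
}

for rule in plan.violations(latest):
    print(f"{rule.resource}/{rule.metric} out of range, notify {rule.owner}")
```

Writing the plan this way forces a decision on each item in the list: every rule needs a resource, a metric, a priority, a frequency, an acceptable range and an owner. In practice, each rule would likely translate into a CloudWatch alarm with its own period and threshold rather than being evaluated by hand.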