What is cloud application performance management (cloud APM)?
Cloud application performance management (cloud APM) is the process of monitoring resources that support software application performance in public cloud, private cloud and hybrid cloud environments, and then taking action to resolve issues and maintain optimal performance.
The major goal of cloud APM is the same as traditional APM: helping administrators quickly identify and resolve any issues with a cloud-based application, whether they affect the user experience (UX) or back-end functions, such as security and costs. However, traditional APM is meant for on-premises apps and infrastructure, while cloud APM is built for cloud infrastructure: it provides the flexibility and scalability those environments demand and handles the complexity that comes with them.
Cloud APM vs. application performance monitoring
The term APM is often used synonymously with application performance monitoring, a subcategory of management focused on monitoring metrics that underpin application performance and usability. APM tools have begun to evolve beyond basic monitoring and toward remediation, but true app management functionality is still mostly nascent, given the rapid growth of applications, their complexity and the number of teams and technologies involved in developing and maintaining them.
In the context of cloud APM, issues are typically not remediated through the APM tool itself. The resolution process could involve on-premises adjustments for private cloud workloads, as well as tweaking the cloud services and functions upon which the application depends. It may also include turning off a cloud service until the issue has been resolved. Under either interpretation of APM, the first step in identifying and fixing application performance problems is knowing what's happening. Software agents placed on an application server can monitor application, service and database response times.
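As a rough illustration of what such an agent does, the following Python sketch wraps a function and records how long each call takes. It is a minimal in-process stand-in, not how any particular APM product is implemented; the `timed` decorator, the `response_times` store and the `fetch_orders` function are hypothetical names chosen for this example.

```python
import time
from collections import defaultdict
from functools import wraps

# Hypothetical in-process store: operation name -> list of durations in seconds.
response_times = defaultdict(list)


def timed(operation):
    """Record the wall-clock duration of every call to the wrapped function."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Record the duration even when the call raises an exception.
                response_times[operation].append(time.perf_counter() - start)
        return wrapper
    return decorator


@timed("db.query")
def fetch_orders():
    time.sleep(0.01)  # stand-in for a real database call
    return ["order-1", "order-2"]


fetch_orders()
```

A real agent would ship these samples to a collector rather than keep them in memory, but the measurement step is essentially the same.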
Administrators can also use cloud APM tools to combine data from disparate monitoring silos into a correlation engine and dashboard, which makes audit logs easier to read and saves IT staff from memory-dependent and error-prone manual correlation and analysis. APM tools can display graphical representations of how applications behave on end-user devices, including index-based graphs, to measure end-user experience and satisfaction or "happiness" and gauge how service-based events affect these ratings. Cloud APM, like application performance monitoring, extends observability beyond system availability, performance and response times.
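One widely used index of this kind is the Apdex score, which sorts response times into satisfied, tolerating and frustrated buckets. The sketch below assumes an Apdex-style calculation; the article does not say which index a given tool uses, and the function name and the 0.5-second threshold are assumptions for illustration.

```python
def satisfaction_index(response_times_s, threshold_s=0.5):
    """Apdex-style score in [0, 1]: (satisfied + tolerating / 2) / total.

    satisfied:  response time <= threshold
    tolerating: threshold < response time <= 4 * threshold
    frustrated: everything slower (contributes nothing to the score)
    """
    if not response_times_s:
        raise ValueError("need at least one sample")
    satisfied = sum(1 for t in response_times_s if t <= threshold_s)
    tolerating = sum(
        1 for t in response_times_s if threshold_s < t <= 4 * threshold_s
    )
    return (satisfied + tolerating / 2) / len(response_times_s)


samples = [0.2, 0.4, 0.9, 1.5, 3.0]  # response times in seconds
score = satisfaction_index(samples)  # 2 satisfied, 2 tolerating, 1 frustrated
```

Plotting this score over time, next to deployment or incident markers, is one way to see how service events move user satisfaction.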
What metrics does cloud APM track?
There are multiple metrics that organizations can use to monitor the ongoing performance of their cloud applications. For instance, they can identify areas that require attention to reduce errors in the long run. The following is a list of key application performance metrics APM uses to monitor app performance:
- Resource availability. Is the instance still running, or are database requests hanging?
- Response time. Are slow response times due to network bandwidth or underlying resource issues?
- Application errors. What's their frequency and source?
- Traffic levels. How many users typically access the cloud application, and does it have sufficient scalability to handle a sudden spike in activity?
- End-user satisfaction. What's the success rate of a given task, and how long does it take?
The monitoring priority can shift depending on the workload and business needs. Also, different aspects of cloud APM may overlap, such as responses to denial-of-service (DoS) and distributed DoS (DDoS) attacks, which affect both performance and security.
Monitoring isn't just used to identify problems. It's also useful to know what's operating well so you don't devote time and effort where it's not needed.
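To make a few of these metrics concrete, the following sketch derives traffic level, error rate and 95th-percentile response time from a batch of request records. The `Request` record and the choice to count HTTP 5xx responses as application errors are assumptions for this example, not a definition taken from any APM tool.

```python
from dataclasses import dataclass


@dataclass
class Request:
    duration_ms: float
    status: int  # HTTP status code


def summarize(requests):
    """Return request count, error rate and p95 latency for a batch."""
    if not requests:
        raise ValueError("need at least one request")
    durations = sorted(r.duration_ms for r in requests)
    errors = sum(1 for r in requests if r.status >= 500)
    # Nearest-rank percentile: ceil(0.95 * n), computed with integer math.
    rank = (95 * len(durations) + 99) // 100
    return {
        "request_count": len(requests),
        "error_rate": errors / len(requests),
        "p95_ms": durations[rank - 1],
    }


# 19 fast successful requests plus one slow server error.
batch = [Request(100 + i, 200) for i in range(19)] + [Request(900, 503)]
summary = summarize(batch)
```

A percentile is used instead of a mean because a single slow outlier, like the 900 ms request here, would otherwise distort the picture of typical response time.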
Benefits of cloud APM
Cloud APM benefits overlap with those of traditional APM, but they apply specifically to cloud environments. They include the following:
- Monitoring application performance and availability. APM increases visibility into how business-critical applications are performing using relevant metrics.
- Quickly diagnosing and troubleshooting performance issues. Performance issues range from simple ones, such as defects or bugs that don't cause outages, to significant downtime that could lead to loss of business revenue. Through capabilities like real-time alerts, administrators can act quickly to rectify these issues and improve uptime.
- Helping administrators identify poor UX quickly. Administrators and DevOps engineers can monitor UX as part of the application performance analysis. They can identify what isn't working and then optimize the app to improve UX.
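The real-time alerts mentioned above usually come down to evaluating a rule over a recent window of data. The sketch below shows one simple form, an error-rate threshold over the last N requests; the `ErrorRateAlert` class, window size and threshold are illustrative assumptions rather than the behavior of any specific product.

```python
from collections import deque


class ErrorRateAlert:
    """Signal when the error rate over the last `window` requests exceeds `threshold`."""

    def __init__(self, window=100, threshold=0.05):
        self.outcomes = deque(maxlen=window)  # True = success, False = failure
        self.threshold = threshold

    def record(self, ok):
        """Record one request outcome and return True if the alert should fire."""
        self.outcomes.append(ok)
        # Wait for a full window so a handful of early samples cannot trigger it.
        if len(self.outcomes) < self.outcomes.maxlen:
            return False
        failures = sum(1 for outcome in self.outcomes if not outcome)
        return failures / len(self.outcomes) > self.threshold


alert = ErrorRateAlert(window=10, threshold=0.2)
fired = [alert.record(ok) for ok in [True] * 10 + [False] * 3]
```

In this run the alert stays quiet until the third consecutive failure pushes the windowed error rate to 30%, above the 20% threshold.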
Challenges of cloud APM
Cloud APM platforms have their share of performance-related challenges that organizations must deal with when implementing APM in cloud-native environments. These issues can complicate workflows and cause bottlenecks. They include the following:
- Environment complexity. Cloud environments have complex infrastructures where multiple users use various services that work together. When there's an issue, pinpointing the source of problems can be difficult.
- Real-time monitoring. Cloud APM platforms aren't immune to latency issues, which can make real-time monitoring difficult at times.
- Costs. Transferring data across different services and locations can be costly. Even if a cloud APM platform is cost-effective at first, organizations must take longer-term data transfer costs into consideration.
Traditional APM vs. cloud APM
Cloud APM must account for more dependencies in application performance than traditional APM -- for example, monitoring network communications to detect problems between the application and any cloud services it requires to run. Many cloud APM tools monitor both latency and the number of incoming and outgoing requests an application makes.
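A minimal sketch of that kind of bookkeeping might look like the following: a counter for incoming requests and, for each downstream cloud service, a call count and mean latency. The `DependencyTracker` class and the service names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class DependencyStats:
    calls: int = 0
    total_latency_ms: float = 0.0

    @property
    def mean_latency_ms(self):
        return self.total_latency_ms / self.calls if self.calls else 0.0


class DependencyTracker:
    """Count incoming requests and outgoing calls per downstream service."""

    def __init__(self):
        self.incoming = 0
        self.outgoing = defaultdict(DependencyStats)

    def record_incoming(self):
        self.incoming += 1

    def record_outgoing(self, service, latency_ms):
        stats = self.outgoing[service]
        stats.calls += 1
        stats.total_latency_ms += latency_ms


tracker = DependencyTracker()
tracker.record_incoming()
tracker.record_outgoing("object-storage", 40.0)  # hypothetical service names
tracker.record_outgoing("object-storage", 60.0)
tracker.record_outgoing("queue", 5.0)
```

Comparing per-service latency against the application's own response time helps show whether a slowdown originates in the app or in a service it depends on.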
Different types of cloud services are monitored in different ways. An app running in a virtualized instance produces more log data than a serverless function.
Another distinction between traditional APM and cloud APM is visibility into the underlying infrastructure for IT operations metrics. An enterprise hosting its application on premises or in a private cloud can see and control its physical IT infrastructure to help fix performance issues. By contrast, public cloud architecture prevents deep visibility into the underlying IT assets, so the metrics and criteria that traditional APM reports on may not be available.
The visibility issue makes it more challenging to perform root cause analysis and troubleshoot performance problems in cloud APM environments. It's increasingly important that cloud providers offer visibility into their infrastructures, especially in more complicated multi-cloud environments.
Cloud application performance management tools
As more enterprises move applications to the cloud, they increasingly require tools to monitor and manage application performance and availability across a distributed computing environment. Some monitoring tools include predictive capabilities to alert administrators of potential problems and automate the process to resolve them.
By nature, APM tools from the major public cloud providers perform cloud APM to monitor resource use, manage costs and observe network performance. Native cloud APM capabilities can offer advantages such as compatibility with and deeper traceability for services in that provider's cloud ecosystem. However, visibility into some core metrics isn't always available. The main APM tools for major public cloud platforms are Amazon's CloudWatch for AWS services, Google Cloud's operations suite (formerly Stackdriver) and Microsoft's Azure Application Insights.
Third-party APM vendors historically have advantages in their depth of reporting and visualization, as well as their ability to tie into various platforms. Increasingly, standalone APM tools integrate with cloud apps and have AI-powered capabilities. They use machine learning for tasks like advanced anomaly detection and resolution of problems in real time. Most of these vendors deliver their tools in a software-as-a-service model; some offer them as managed services, or they let clients run them in their own environments.
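Vendor anomaly detection models are proprietary and far more sophisticated than anything shown here. As a deliberately simple statistical stand-in, the sketch below flags a latency sample when it sits more than a set number of standard deviations away from the preceding window; the function name, window size and threshold are assumptions.

```python
from statistics import mean, stdev


def find_anomalies(values, window=10, z_threshold=3.0):
    """Return indexes of values far outside the preceding `window` samples."""
    anomalies = []
    for i in range(window, len(values)):
        baseline = values[i - window:i]
        mu, sigma = mean(baseline), stdev(baseline)
        if sigma == 0:
            continue  # flat baseline: the z-score is undefined, so skip
        if abs(values[i] - mu) / sigma > z_threshold:
            anomalies.append(i)
    return anomalies


latencies_ms = [100, 102, 98, 101, 99, 103, 97, 100, 102, 98, 101, 450, 100]
spikes = find_anomalies(latencies_ms)  # flags only the 450 ms sample
```

A rolling baseline like this adapts to gradual drift, but one large spike also inflates the baseline for the samples that follow, which is one reason production tools rely on more robust models.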
Third-party APM vendors include Broadcom, Cisco, Datadog, Dynatrace, New Relic, OpenText, SolarWinds, Splunk APM, Tingyun and Zoho.
Open source APM tools
Given the integration complexity of native and vendor-specific tools across the spectrum of cloud computing, open source instrumentation has become increasingly popular, including for cloud monitoring. Gartner predicted that, by 2025, half of new cloud-native application monitoring will use open source instrumentation instead of vendor-specific agents, a tenfold increase from 2019. Efforts to support open source monitoring include the OpenTelemetry project, which provides developers with software development kits, application programming interfaces and other tools to collect and analyze software performance and behavior data.
Examples of open source platforms that provide APM for cloud-native environments include Apache SkyWalking, Jaeger, Nagios, Prometheus, SigNoz, Uptrace and Zabbix. These platforms let end users view monitoring metrics and other relevant information in a single pane of glass. Also, Apache SkyWalking, Prometheus, SigNoz and Uptrace support OpenTelemetry data, such as traces, metrics and logs.