Increased visibility into a cloud environment gives admins a detailed view of all activity and helps address high costs, application performance trouble and security threats. While it might seem like a basic need, not all enterprises have a cloud visibility strategy.
Admins need to link cloud activity, and the associated charges, with the way users interact with cloud applications. They also need to link public cloud conditions -- the state of resources and application elements -- to data center conditions, as well as to conditions in other public clouds in a multi-cloud deployment. The more broadly an application is hosted, the more complex the visibility problem becomes.
Use these cloud visibility best practices to get a better view of an environment.
1. Inventory the known factors
Admins need to understand the data that's available to them and what that data tells them about performance, availability and cost. They should then correlate the problems users report with the monitoring data. The purpose of the comparison is to see whether quality of experience problems reported by users leave any observable change in conditions or in the state of resources and applications. When a reported problem leaves no trace in the data, it sits in an opaque zone that current monitoring doesn't reach. These opaque zones are the most common cloud visibility problems.
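The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not a production tool: the record layout, the `find_opaque_reports` name, the 500 ms threshold and the five-minute window are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def find_opaque_reports(reports, samples, threshold, window=timedelta(minutes=5)):
    """Return user reports that have no out-of-range metric sample nearby in time."""
    opaque = []
    for report in reports:
        matched = any(
            abs(sample["time"] - report["time"]) <= window
            and sample["value"] > threshold
            for sample in samples
        )
        if not matched:
            opaque.append(report)
    return opaque


# Hypothetical data: latency samples in milliseconds and two user complaints.
samples = [
    {"time": datetime(2024, 1, 1, 10, 0), "value": 120},
    {"time": datetime(2024, 1, 1, 10, 5), "value": 950},
    {"time": datetime(2024, 1, 1, 14, 0), "value": 130},
]
reports = [
    {"time": datetime(2024, 1, 1, 10, 7), "text": "checkout is slow"},
    {"time": datetime(2024, 1, 1, 14, 2), "text": "page will not load"},
]

# The 10:07 report lines up with the 950 ms sample; the 14:02 report does not
# line up with anything above the threshold, so it is returned as opaque.
opaque = find_opaque_reports(reports, samples, threshold=500)
```

Reports returned by a check like this are candidates for opaque zones: users saw a problem, but the collected data shows nothing unusual.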
For these data scope problems, add data collection points, either through additional monitoring data collected by middleware or with probes. A surprising number of enterprises don't fully use the data that their container and orchestration tools, such as Docker and Kubernetes, or service mesh tools, such as Istio and Linkerd, make available.
If data scope is not the problem, it could be data interpretation. Data interpretation problems can arise because of a lack of data centralization. The data available can be too voluminous or too complex to permit easy analysis. Admins can address these issues with centralized monitoring, as well as AI and machine learning (ML) technologies.
2. Add probes and tracing
For applications developed in-house, consider adding application performance monitoring probes to the code. Insert probes at specific points where it's important to establish visibility. For example, you would place an in-code trigger or probe at points where the decision logic of the program indicates some significant event occurred, such as a transaction that doesn't match anything in the database. These probes generate events that can then be captured and analyzed. Make sure to include the time, event type and any relevant message data in the probe's event. It's critical to facilitate the correlation of observations or conditions with each other and with user reports -- a software probe event is only useful for real analysis when it can be tied to other events. DTrace is a well-known and widely used code trace tool for troubleshooting. It can also trace middleware and OS functions.
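As a rough sketch of an in-code probe, the Python below emits an event when a lookup finds no matching record. The function names, the event fields and the use of a correlation ID to tie events together are assumptions for illustration; a real deployment would send the event to whatever monitoring pipeline is in use instead of appending it to a list.

```python
import json
import time
import uuid


def make_probe_event(event_type, message, correlation_id, now=None):
    """Build one probe event with a time, a type, message data and a correlation ID."""
    return {
        "time": time.time() if now is None else now,
        "event_type": event_type,
        "message": message,
        "correlation_id": correlation_id,
    }


def lookup_order(order_id, database, emit, correlation_id):
    """Hypothetical business function with a probe at a significant decision point."""
    record = database.get(order_id)
    if record is None:
        # Probe: the transaction does not match anything in the database.
        emit(make_probe_event("order_not_found", {"order_id": order_id}, correlation_id))
    return record


events = []                      # stand-in for a real event collector
request_id = str(uuid.uuid4())   # shared by every event raised for this request
lookup_order("A-100", {"A-1": {"total": 20}}, events.append, request_id)
print(json.dumps(events[0]))
```

Because every event carries the same `correlation_id` for a given request, events from different probes can later be grouped and compared with user reports.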
For third-party software, admins need to rely on something outside application code. The most popular concept is the bytecode trace. This type of trace uses message tags to follow work between components or steps. ManageEngine, Sentry, Catchpoint and Dynatrace are among the best-known tools for this kind of tracing. The trace data provides insights into workflow performance and identifies key components in the workflow. This helps focus monitoring attention on the right places.
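The idea of following work with a message tag can be illustrated with a small Python sketch. It only shows the role the tag plays. With third-party software, an agent or proxy would attach and read the tag, not the application code, and the header name and component functions here are hypothetical.

```python
import uuid

TRACE_HEADER = "x-trace-id"  # hypothetical tag name, chosen for this example

spans = []  # stand-in for the trace store a tracing tool would maintain


def record_span(component, headers):
    """Note that a component handled the message carrying this tag."""
    spans.append({"trace_id": headers[TRACE_HEADER], "component": component})


def gateway(headers):
    # Tag the message at the entry point if it does not carry a tag yet.
    headers = dict(headers)
    headers.setdefault(TRACE_HEADER, uuid.uuid4().hex)
    record_span("gateway", headers)
    return order_service(headers)


def order_service(headers):
    record_span("order_service", headers)
    return payment_service(headers)


def payment_service(headers):
    record_span("payment_service", headers)
    return headers[TRACE_HEADER]


trace_id = gateway({})
# Reassemble the workflow for one unit of work from its tag.
path = [s["component"] for s in spans if s["trace_id"] == trace_id]
```

Here `path` lists the components in the order the tagged message reached them, which is the kind of workflow view the trace data is meant to provide.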
3. Centralize monitoring
When information is divided, interpretation becomes difficult. Centralized monitoring collects monitoring data and stores historical data for analysis. This strategy improves visibility, and it works as long as admins collect the data they need.
A centralized monitoring strategy is a good way to capture statistics on information movement and infrastructure behavior from a variety of places. This is especially true when the separation of data limits its value in assessing cloud performance. Key tools for centralized monitoring include the open source Netdata and proprietary tools, such as AppDynamics, New Relic and Amazon CloudWatch.
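To show why pulling separated data together helps, the sketch below merges events from two sources into one time-ordered view. The feed names, the `ts` and `msg` fields and the sample events are invented for the example, and each feed is assumed to be sorted by time already.

```python
import heapq


def normalize(source, record):
    """Map a source-specific record onto one shared schema."""
    return {"time": record["ts"], "source": source, "detail": record["msg"]}


def merge_timelines(feeds):
    """Merge per-source event lists (each sorted by 'ts') into one timeline."""
    streams = [
        [normalize(source, record) for record in records]
        for source, records in feeds.items()
    ]
    return list(heapq.merge(*streams, key=lambda event: event["time"]))


# Hypothetical feeds from a public cloud and a data center, times in seconds.
feeds = {
    "public_cloud": [
        {"ts": 100, "msg": "autoscale triggered"},
        {"ts": 160, "msg": "instance added"},
    ],
    "data_center": [
        {"ts": 130, "msg": "database latency high"},
    ],
}

timeline = merge_timelines(feeds)
for event in timeline:
    print(event["time"], event["source"], event["detail"])
```

Seen separately, neither feed explains much. In the merged timeline, the data center latency event falls between the two cloud scaling events, which gives an admin something concrete to investigate.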
4. Add AI and ML tools
AI/ML technology is now a popular way to improve cloud visibility because it enhances the speed and sophistication of data interpretation. It is often combined with a centralized monitoring strategy. The approach assumes that operations personnel can't always interpret the meaning of the available data, or take appropriate action, on their own.
However, the biggest challenge in improving cloud visibility through AI data interpretation is finding tools that see all the essential data. Data ingestion capabilities, such as linkages to various data sources, and interpretation models vary widely between tools. Admins must assess the tools with their needs and data sources in mind. Even after careful review, run a trial before committing to an AI package.
AWS, Microsoft Azure and Google Cloud offer some AI cloud data analysis tools and features, and products such as LogicMonitor, Zesty and IBM Cloud Pak for Watson AIOps can be useful.
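As a very small stand-in for automated data interpretation, the Python below flags metric values that sit far from the rest of a series. It is a plain statistical check, not an ML model and not how any of the products above work; the function name, the 2.5 standard deviation threshold and the latency values are assumptions for the example.

```python
from statistics import mean, stdev


def find_anomalies(values, threshold=2.5):
    """Return indexes of values more than `threshold` standard deviations from the mean."""
    if len(values) < 2:
        return []
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical latency series in milliseconds with one obvious spike at the end.
latency_ms = [102, 98, 101, 99, 100, 103, 97, 100, 101, 99, 100, 480]
anomalies = find_anomalies(latency_ms)
```

A check like this only sees the series it is given, which mirrors the point above: automated interpretation is only as good as the data the tool can ingest.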
Additionally, actionability is an important aspect of cloud visibility. A record of what you know is helpful but only to the extent that it can generate useful actions for your operations team. Review how visibility strategies convert into effective cloud operations.