Every technology change brings benefits and risks, and hybrid cloud is no exception. Companies have gained significant business performance and productivity when they combine public cloud technology with their legacy data center applications.
But to create this symbiotic mix, they've had to increase their application complexity. With that increase, identifying and resolving problems becomes harder, and actionable information from telemetry becomes more important. Consequently, interest in observability has exploded. How can that challenge be met?
Monitoring vs. observability
Monitoring is the traditional method to gather information about application state and performance, and it works well for monolithic applications and simple data center hosting.
Public cloud platforms also offer monitoring services, and many enterprises that adopt hybrid cloud assume those two monitoring sources are enough to resolve issues. Nearly all find they aren't, because data gathered from the many locations where application components run is difficult to correlate and interpret.
The difference between monitoring and observability is that the latter is inherently application "holistic." Observability offers a view into an application's state, regardless of how many components make it up or where those components happen to be hosted. Observability assumes an application is an ecosystem that must be understood as a cooperative system of elements. The system's status is derived from the combined state of all its elements and workflows.
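To make that idea concrete, here's a minimal sketch -- in Python, using invented component and workflow names -- of how observability derives an application's state from the combined state of its elements and workflows, rather than from any single component:

```python
from enum import IntEnum

class Health(IntEnum):
    OK = 0
    DEGRADED = 1
    FAILED = 2

def workflow_health(component_health: dict, workflow: list) -> Health:
    """A workflow is only as healthy as its worst component."""
    return max(component_health[c] for c in workflow)

def system_health(component_health: dict, workflows: dict) -> dict:
    """Derive the application's state per workflow it supports."""
    return {name: workflow_health(component_health, steps)
            for name, steps in workflows.items()}

# Hypothetical hybrid cloud application: some components hosted in the
# public cloud, others in the data center.
components = {
    "cloud_frontend": Health.OK,
    "cloud_api": Health.OK,
    "dc_order_service": Health.DEGRADED,
    "dc_database": Health.OK,
}
workflows = {
    "browse": ["cloud_frontend", "cloud_api"],
    "checkout": ["cloud_frontend", "cloud_api",
                 "dc_order_service", "dc_database"],
}

# 'checkout' is DEGRADED while 'browse' stays OK -- observability
# reports per-workflow state, not just per-component state.
print(system_health(components, workflows))
```

Note that two workflows sharing the same components can still report different states, which is exactly the per-ecosystem view that component-level monitoring alone can't provide.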
What is hybrid cloud observability?
The critical piece of hybrid cloud observability is the workflows, for two reasons:
First, workflows define how components and hosting points combine to create the application ecosystem. That ecosystem is the relationships between elements that enable IT professionals to interpret individual monitoring results.
Second, workflows represent the network's connectivity contribution to the application, and that contribution is vital in hybrid cloud computing. The network is the critical -- and sometimes hidden -- piece of observability in the hybrid cloud. Most successful observability strategies will focus on the network.
Enterprises that successfully address hybrid cloud observability do so primarily through three techniques:
- network traffic analysis;
- automatic tracing to track workflows and identify integral application components; and
- application performance monitoring (APM) and code modification to introduce contextual triggers.
Network traffic analysis. For hybrid cloud observability, network traffic analysis rests on a familiar idea: Diagnosing application problems usually starts with problem isolation.
Because network connections link application components in the cloud to partner components in the data center, it's possible to monitor the cloud-to-data center traffic and identify specific application workflows. These flows can then be used to isolate a problem to a specific cloud or to the data center and, from there, traditional monitoring can pinpoint the issue. The challenge here is identifying traffic at the data rates involved in hybrid cloud connections.
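As an illustration, the following sketch -- with invented flow records and an arbitrary latency threshold, not a real traffic analyzer -- shows how per-workflow measurements on the cloud-to-data center link could attribute a slow workflow to one side of the boundary:

```python
from statistics import mean

# Hypothetical flow records captured on the cloud-to-data center link:
# (workflow_id, segment, latency_ms). Segment names are illustrative.
flows = [
    ("checkout", "cloud", 12.0),
    ("checkout", "network", 8.0),
    ("checkout", "datacenter", 240.0),
    ("browse", "cloud", 10.0),
    ("browse", "network", 7.0),
]

def isolate(flow_records, workflow_id, threshold_ms=100.0):
    """Return the segments whose average latency exceeds the threshold
    for one workflow -- i.e., where to point traditional monitoring."""
    by_segment = {}
    for wf, segment, latency in flow_records:
        if wf == workflow_id:
            by_segment.setdefault(segment, []).append(latency)
    return [seg for seg, vals in by_segment.items()
            if mean(vals) > threshold_ms]

print(isolate(flows, "checkout"))  # -> ['datacenter']
print(isolate(flows, "browse"))    # -> [] (no segment is slow)
```

A production analyzer works on far higher data rates and richer flow metadata, but the isolation logic is the same: narrow the fault domain first, then drill in with conventional monitoring.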
Automatic tracing. Tracing is the most broadly applicable approach to hybrid cloud observability because it doesn't depend on access to source code. It also focuses on the workflows that link components and on the network connectivity that supports them, which ensures that network behavior is introduced into application monitoring.
However, automatic tracing isn't a foolproof mechanism for workflow and component identification. It can be problematic where there's significant component reuse; within the public cloud, where large numbers of web services are used; and where traffic flows and component connections are hidden by the cloud provider.
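The mechanism behind automatic tracing can be sketched as runtime wrapping: an agent intercepts calls and records spans without the source code changing. Here's a simplified stand-in for that idea in Python -- the function names and rebinding step are illustrative, not how a real tracing agent attaches:

```python
import functools
import time

TRACE = []  # spans collected during a run; an agent would export these

def auto_trace(fn):
    """Runtime wrapper that records a span for each call -- the kind of
    instrumentation a tracing agent applies without source changes."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.monotonic()
        try:
            return fn(*args, **kwargs)
        finally:
            TRACE.append({
                "component": fn.__name__,
                "duration_ms": (time.monotonic() - start) * 1000,
            })
    return wrapper

# Hypothetical workflow: a cloud component calling a data center one.
def dc_lookup(order_id):
    return {"order": order_id, "status": "shipped"}

def cloud_handler(order_id):
    return dc_lookup(order_id)

# An agent would patch these in place; here we rebind them explicitly.
dc_lookup = auto_trace(dc_lookup)
cloud_handler = auto_trace(cloud_handler)

cloud_handler(42)
print([s["component"] for s in TRACE])  # -> ['dc_lookup', 'cloud_handler']
```

The collected spans reveal the workflow's call order and timing, which is the value of tracing; the weaknesses noted above arise when reused or provider-hidden components make span-to-workflow attribution ambiguous.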
Code modification and APM. These techniques enable enterprises to generate tracing signals at specific points, and a strong automatic tracing strategy incorporates at least some code modification technologies -- usually open source options -- to enhance observability. Because the inserted code captures the developer's specific conditions for observability, APM provides the most precise approach to improving observability in a hybrid cloud model.
The disadvantage of APM is that not all tools support all programming languages, and source code is not available for all application components.
Many enterprise hybrid cloud applications involve new development in the cloud, linked to legacy components in the data center. The boundary point between the two can be a small "shim" layer of data center code that connects app interfaces to modules. Because developers can easily introduce code modifications into any new development project, a shim layer at the cloud boundary makes cloud and data center monitoring data easier to interpret: the trace events its modified code generates provide a contextual link between the two.
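A shim of this kind can be very small. The sketch below -- with hypothetical function names and an in-memory event list standing in for a telemetry collector -- shows a shim that forwards a call to an unmodifiable legacy component while emitting trace events that carry the cloud-side request ID across the boundary:

```python
import json
import uuid

EVENTS = []  # trace events the shim emits; a collector would ingest these

def emit(event):
    EVENTS.append(json.dumps(event))

def legacy_inventory(sku):
    """Stands in for a data center component whose source can't change."""
    return {"sku": sku, "on_hand": 7}

def inventory_shim(sku, cloud_request_id):
    """Shim at the cloud/data center boundary: forwards the call and
    emits trace events carrying the cloud-side request ID, giving both
    monitoring domains a shared context for the same workflow."""
    emit({"request_id": cloud_request_id, "boundary": "cloud->dc",
          "target": "legacy_inventory", "sku": sku})
    result = legacy_inventory(sku)
    emit({"request_id": cloud_request_id, "boundary": "dc->cloud",
          "target": "legacy_inventory"})
    return result

request_id = uuid.uuid4().hex
inventory_shim("A-100", request_id)
print(len(EVENTS))  # two boundary events share one cloud request ID
```

Because both events carry the same request ID, cloud-side traces and data center monitoring records for one transaction can be joined after the fact, which is the contextual link the shim exists to provide.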
The best approach to hybrid cloud observability is often a combination of the three strategies identified here. APM and code modification offer the most precision where they can be applied, and automatic tracing and network traffic analysis fill in any critical gaps.
It's important to remember that for most enterprises, no observability strategy will be 100% effective, and in most cases it's possible to carry attempts at full observability too far, increasing costs without a corresponding gain in benefits. Once the majority of issues can be reliably identified and resolved, it's best to measure further changes in cost/benefit terms to avoid undue operational and financial impacts. More observability isn't always better.