Container logging tips for IT troubleshooting and more
Don't just leave container log data on a host and forget about it. Instead, establish a detailed strategy to index, search, correlate and analyze that data.
Containers have become a welcome approach to accommodate vital processes such as continuous development, automation and immutable infrastructure. But containers are ephemeral and can proliferate rapidly as workload needs change -- which makes container management and maintenance critical.
Logs and log management are at the core of a container management strategy, enabling container visibility, troubleshooting and performance enhancements.
The challenge with container logs
IT admins can spin up containers, and tear them down, faster than they can VMs, which enables rapid scalability. As a result, containers are often ephemeral and last as little as a few seconds, depending on how long the instance is needed.
Containers' transient nature makes them ideal for agile software development. But containers also create new virtualization management challenges -- particularly around logs and log management.
Short life cycles make log data difficult to store. Because of their short lifespans, containers are ideally stateless and non-persistent entities; they are not natively designed to store persistent data. This makes it a challenge to collect and store logs. For example, if a log is tied to the container itself, stopping the container will typically stop log collection for that container as well, and the log data is destroyed along with the container unless it is protected or exported to a storage resource.
Container logging must involve multiple perspectives. There are three different environments or levels that enable a container to run an application: the container itself; the container engine, such as the Docker daemon; and the shared host OS. Proper logging in a containerized environment must identify and correlate log events across all three of these environments, so that IT or development teams can trace the root cause of problems.
Enhanced container security increases logging complexity. OS vulnerabilities can affect all containers that share the common kernel. To strengthen container security, some organizations run containers within a VM. However, this increases logging complexity, as it requires admins to log activity not only in the application, the engine and the host OS, but also the companion VM and hypervisor.
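To make the idea of correlating events across layers more concrete, here is a minimal Python sketch rather than a production tool. It assumes each layer (container, engine, host OS) already yields events sorted by timestamp. The LogEvent type and the merge_layers and events_near helpers are hypothetical names used only for illustration.

```python
import heapq
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class LogEvent:
    timestamp: datetime
    source: str   # hypothetical layer label: "container", "engine" or "host"
    message: str


def merge_layers(*streams: Iterable[LogEvent]) -> Iterator[LogEvent]:
    """Merge per-layer streams (each already time-sorted) into one timeline."""
    return heapq.merge(*streams, key=lambda event: event.timestamp)


def events_near(timeline: Iterable[LogEvent], when: datetime,
                window: timedelta = timedelta(seconds=5)) -> List[LogEvent]:
    """Return events from any layer that fall within `window` of `when`."""
    return [e for e in timeline if abs(e.timestamp - when) <= window]
```

With a single timeline, an application error can be lined up against engine or host events that occurred within a few seconds of it, which is often the first step in root-cause analysis.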
Four container logging options
Choose the best container logging approach based on the business' unique needs, but always enforce logging related to the application, the host operating system and the container engine.
Four common options include:
Application-based logging. In this approach, the application within the container handles its own logging. For example, a Java application within a container might use the Apache Log4j utility to produce logs and send those log entries to a remote centralized server. This method is most reminiscent of logging in traditional monolithic applications, and therefore is extremely helpful for organizations that are transitioning to container technologies.
The application-based approach, however, does not "see" the container environment, so the logging bypasses both the operating system and the container engine. And since the logging utility runs within the container itself, it adds load and can lower the application's performance. Further, the logging tool defaults to local, non-persistent storage, so log data is lost when the container is destroyed, unless admins forward it to a persistent location.
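The article's example is Java with Log4j; the sketch below shows the same pattern with Python's standard logging module instead, purely as an illustration. The application formats its own entries and hands them to a handler that ships them to a remote syslog server. The JsonFormatter class, the build_app_logger and remote_syslog_handler helpers, and the host name mentioned afterward are assumptions, not part of any particular product.

```python
import json
import logging
import logging.handlers


class JsonFormatter(logging.Formatter):
    """Render each record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


def remote_syslog_handler(host: str, port: int = 514) -> logging.Handler:
    """Handler that sends records to a remote syslog server (UDP by default)."""
    return logging.handlers.SysLogHandler(address=(host, port))


def build_app_logger(name: str, handler: logging.Handler) -> logging.Logger:
    """Attach the given handler so the application does its own logging."""
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate output through the root logger
    logger.addHandler(handler)
    return logger
```

An application might call build_app_logger("orders", remote_syslog_handler("logs.example.internal")), where the host name is a placeholder. Because the handler runs inside the application process, its cost counts against the application, which is the performance tradeoff described above.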
Volume-based logging. Since containers are typically stateless and ephemeral, any files and data generated within a container, including log data, are lost when the container is destroyed. As mentioned above, to access log data after the container is gone, send it to a remote centralized server -- or store log data in a data volume.
A container engine, such as Docker, supports volume storage in a directory area that is set aside and mapped specifically for the container engine's use. The volume feature enables multiple containers to share a single volume, which makes it easier to centralize logging activities across the entire container environment. Since log data is in storage, admins can copy or back it up for analysis or data protection. One disadvantage of the volume feature is that the storage area is tied to the container engine on a specific host; moving containers to a different host might disrupt the data storage and result in lost log data.
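From the application's side, volume-based logging can be as simple as writing log files into the directory where the volume is mounted. The following Python sketch assumes the mount point is passed in through a LOG_DIR environment variable and defaults to /var/log/app; both are hypothetical choices, as are the function names.

```python
import logging
import logging.handlers
import os
import socket
from pathlib import Path


def volume_file_handler(log_dir: str, filename: str = "app.log") -> logging.Handler:
    """Create a rotating file handler inside the mounted volume directory."""
    directory = Path(log_dir)
    directory.mkdir(parents=True, exist_ok=True)
    handler = logging.handlers.RotatingFileHandler(
        directory / filename,
        maxBytes=10 * 1024 * 1024,  # rotate at roughly 10 MB (arbitrary choice)
        backupCount=5,              # keep five rotated files (arbitrary choice)
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
    )
    return handler


def configure_volume_logging(name: str = "app") -> logging.Logger:
    """Log to <LOG_DIR>/<hostname>.log; LOG_DIR is a hypothetical variable."""
    log_dir = os.environ.get("LOG_DIR", "/var/log/app")
    # One file per container hostname, so containers sharing the volume
    # do not write to the same file.
    filename = f"{socket.gethostname()}.log"
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(volume_file_handler(log_dir, filename))
    return logger
```

Naming each file after the container's host name is one way to let several containers share a volume without interleaving their output; other layouts, such as a subdirectory per service, would work as well.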
Docker-native logging. Another way to forward log data for correlation and analysis is to use a logging driver or service that is native to the container engine, such as a Docker logging driver. Docker ships with several logging drivers; the syslog driver, for example, forwards log events from each container to a syslog instance on the host system or on a remote server.
A native logging driver gathers log events from the container's stdout and stderr -- output and error paths, respectively. It offers a simple and direct way to centralize logs without the need to interact with log files.
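For a native driver to pick up log events, the application only has to write to stdout and stderr. The Python sketch below shows one way to do that, sending routine messages to stdout and warnings or errors to stderr. The MaxLevelFilter class and the configure_stream_logging helper are hypothetical names, and the level split is an assumption rather than a requirement of any driver.

```python
import logging
import sys


class MaxLevelFilter(logging.Filter):
    """Pass only records at or below max_level."""

    def __init__(self, max_level: int) -> None:
        super().__init__()
        self.max_level = max_level

    def filter(self, record: logging.LogRecord) -> bool:
        return record.levelno <= self.max_level


def configure_stream_logging(name: str = "app", out=None, err=None) -> logging.Logger:
    """INFO and below go to stdout; WARNING and above go to stderr."""
    out = sys.stdout if out is None else out
    err = sys.stderr if err is None else err
    formatter = logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")

    stdout_handler = logging.StreamHandler(out)
    stdout_handler.addFilter(MaxLevelFilter(logging.INFO))
    stdout_handler.setFormatter(formatter)

    stderr_handler = logging.StreamHandler(err)
    stderr_handler.setLevel(logging.WARNING)
    stderr_handler.setFormatter(formatter)

    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the root logger from duplicating output
    logger.addHandler(stdout_handler)
    logger.addHandler(stderr_handler)
    return logger
```

The application never opens a log file in this arrangement. Which driver receives the two streams is decided outside the application, for example with the --log-driver option of docker run.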
A dedicated logging container. The examples above involve services designed to capture and forward log data in a persistent manner. An alternative approach is to run a dedicated logging container that admins manage from within the container engine itself. One logging container can serve many other containers: it receives, aggregates and stores their logs, and forwards them to an application or service for analysis. For example, a container can run a tool such as logspout to capture stdout and stderr data from any containers on the host system, and then forward that data to a remote syslog service.
The notion of a logging container is well aligned with microservices architectures. The principal advantage to dedicated logging containers is mobility -- an array of containers, including the logging container, can move easily from system to system without the need to worry about any dependencies on a particular host.
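The forwarding step that a logging container performs can be sketched in a few lines of Python. This is not how logspout is implemented; it is a simplified illustration that assumes the collector already has (container name, stream, line) tuples in hand and sends them over UDP in a syslog-style format with an ISO 8601 timestamp. All function names and the message layout are assumptions.

```python
import socket
from datetime import datetime
from typing import Iterable, Optional, Tuple

FACILITY_USER = 1    # syslog "user-level messages" facility
SEVERITY_ERROR = 3   # syslog severity for errors
SEVERITY_INFO = 6    # syslog severity for informational messages


def syslog_priority(facility: int, severity: int) -> int:
    """Syslog PRI value: facility * 8 + severity."""
    return facility * 8 + severity


def format_line(container: str, stream: str, line: str,
                hostname: str, when: datetime) -> str:
    """Tag a captured line with its container; stderr maps to error severity."""
    severity = SEVERITY_ERROR if stream == "stderr" else SEVERITY_INFO
    pri = syslog_priority(FACILITY_USER, severity)
    return f"<{pri}>{when.isoformat(timespec='seconds')} {hostname} {container}: {line.rstrip()}"


def forward(lines: Iterable[Tuple[str, str, str]], address: Tuple[str, int],
            hostname: Optional[str] = None) -> int:
    """Send each (container, stream, line) tuple to a syslog server over UDP."""
    hostname = hostname or socket.gethostname()
    sent = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for container, stream, line in lines:
            message = format_line(container, stream, line, hostname, datetime.now())
            sock.sendto(message.encode("utf-8"), address)
            sent += 1
    return sent
```

Because the container name travels with every line, the remote service can still tell the sources apart after the originating containers are gone.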
Container log management best practices
There is no single "right" method to approach container logging, but there are several best practices.
Weigh performance effects. Each container logging approach, as outlined in the previous section, poses tradeoffs in terms of convenience and application performance. Test a logging approach to ensure it collects adequate log data without any negative effects on the underlying application.
Consider container-aware tools. Because containers start and stop quickly and are short-lived, it can be difficult to add them to, or remove them from, monitoring platforms. Evaluate centralized tools, such as Docker App for Sumo Logic and Google's cAdvisor, designed to operate with container engines to monitor and manage containers.
Look for integrations. Ideally, a container monitoring and management tool should integrate with a wealth of platforms and engines to deliver logs across complex infrastructures. This reduces the number of tools required to support the environment and enables more comprehensive correlation and analytics. Tools such as Sumo Logic provide dozens of integrations, including with Docker, Kubernetes and Amazon Elastic Kubernetes Service.
Embrace analytics. Seek ways to use container logging data and metrics for analysis and troubleshooting. For example, a logging tool captures and correlates errors, while a monitoring tool tracks items such as container actions, faults, and CPU and memory usage.
Set log retention and deletion policies. Logs can consume tremendous amounts of storage space, yet decrease in value over time. This means long-term log retention typically brings diminishing returns. Implement log retention and deletion policies, based on the regulatory needs of the business, to save valuable storage capacity.
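A retention policy can start as a small scheduled job. The Python sketch below deletes log files whose last modification is older than a chosen number of days. The function names, the file pattern and the dry-run default are assumptions for illustration; the actual retention period should come from the organization's regulatory requirements.

```python
import time
from pathlib import Path
from typing import List, Optional

SECONDS_PER_DAY = 86400


def expired_logs(log_dir: str, max_age_days: float,
                 now: Optional[float] = None, pattern: str = "*.log*") -> List[Path]:
    """List files in log_dir matching pattern that are older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    return sorted(
        path for path in Path(log_dir).glob(pattern)
        if path.is_file() and path.stat().st_mtime < cutoff
    )


def enforce_retention(log_dir: str, max_age_days: float,
                      dry_run: bool = True, now: Optional[float] = None) -> List[Path]:
    """Delete expired logs unless dry_run is set; return the affected files."""
    victims = expired_logs(log_dir, max_age_days, now=now)
    if not dry_run:
        for path in victims:
            path.unlink()
    return victims
```

Defaulting to a dry run makes it possible to review what would be removed before the policy is enforced.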
Secure log files. Logs often include sensitive details related to IT infrastructure, such as host names, directory paths, environment variables and IP addresses. Wherever possible, use access credentials and encryption to protect log files.
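Restricting file permissions is one basic access control for logs stored on disk. The Python sketch below, which assumes a POSIX system, creates or opens a log file that only its owner can read and write. The open_private_log name is hypothetical, and encryption and credential management are outside the scope of this example.

```python
import os
import stat


def open_private_log(path: str):
    """Open a log file for appending with owner-only (0600) permissions."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_APPEND
    fd = os.open(path, flags, 0o600)
    # If the file already existed with broader permissions, tighten them.
    os.fchmod(fd, stat.S_IRUSR | stat.S_IWUSR)
    return os.fdopen(fd, "a", encoding="utf-8")
```

A caller would use it like any other file object, for example inside a with statement, and the file remains unreadable to other accounts on the host.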
Keep logging separate and available. Place a logging infrastructure in an environment that is both logically and physically separate from the applications being watched. When logging is deployed in the same environment as the applications, the same issues that affect application performance can affect logging -- and render the logging platform useless. In addition, deploy the logging tool or platform in a cluster configuration to ensure better resiliency and availability.