Enterprises are adopting data observability tools to troubleshoot data engineering and data science problems. These tools differ from traditional observability tools, which focus on troubleshooting application performance or security issues, and they can offer budget-friendly alternatives.
Data observability is more than monitoring and logging, collecting metrics and tracing those metrics over time. It also interprets the data being tracked and monitored and maps the relationships between various data sources.
"Data observability tools allow organizations to solve problems by using integrated, single-source-of-truth data stores," said Cindy LaChapelle, principal consultant at ISG, a global technology research and advisory firm.
These tools use automation and artificial intelligence technologies to sift through large amounts of seemingly disparate data streams, analyze them, and centralize and integrate the results into a single source of truth.
With the growing number of diverse data sources, businesses need big-picture views of their environments and faster, more automated ways to identify and fix issues.
"Data observability tools allow businesses to be proactive whereas today, they are mainly reactive," LaChapelle said.
Why choose data observability open source tools
Enterprises have plenty of commercial data observability tools to choose from. Commercial tools have some key advantages in terms of scalability, automation and support. However, data observability open source tools allow teams to experiment with data observability capabilities with no upfront licensing costs. They might also do a better job at complementing existing data engineering workflows and tools.
"If an organization has highly customized observability needs, it's often easiest to meet these demands with a combination of in-house development and open source resources," said Steven Zhang, director of data engineering at Hippo Insurance.
Open source tools also allow enterprises to experiment with basic data observability features to see what capabilities provide the most value and how they can complement existing workflows. This can help teams evaluate commercial tools once they understand what they need.
Open source tools could help address cost concerns and prevent vendor lock-in, LaChapelle said. Over time, the requirements and needs of the business change, and open source solutions offer more flexibility than premium tools.
Open source platforms tend to have a more flexible architecture than commercial solutions, but they might also require deeper business knowledge and skills to customize and evolve over time.
Setting up open source tools
Some organizations prefer to explore and build their own data observability practices when they operate on a large scale, have mature IT processes and have a good critical mass of data engineering talent in-house, said Sumit Misra, vice president of data engineering at LatentView Analytics, an analytics consultancy.
However, teams should plan for a considerable engineering effort to bring all the right pieces together.
"A single open source data observability tool usually doesn't have all the features required to enable complete visibility into an enterprise's data systems," said Alisha Mittal, vice president at Everest Group, an advisory firm.
Some tools are helpful for log and metric collection, while others specialize in log and event tracing. Similarly, some are good at visualization, while others efficiently store event and metrics data. Enterprises usually must use several of these tools in combination to get full visibility.
Teams need to evaluate more tools and components when they have a multilanguage data architecture. Also, though there are no licensing costs, enterprises need to budget for restructuring and upskilling their workforce.
"The key here is to identify which tools complement each other and work well together," Mittal said.
Top data observability open source components
Here are six of the top open source components for setting up data observability capabilities highlighted by experts in the field. These open source tools all support key aspects of a data observability practice such as log and metrics collection, event tracing or visualization of observability data, Mittal said. These projects also have a supportive community, which is required to help implement these tools and maintain them in the long run.
Consider these tools as building blocks. Many enterprises might need to bring together multiple tools to implement a working data observability practice.
Fluentd is an open source data collector. It can also serve as a unified logging layer to simplify data collection. The logging layer provides an abstraction tier for connections to multiple data sources. It supports more than 500 integrations to data sources and streaming services.
It uses a pluggable architecture that allows teams to connect the tool to new sources. Performance was a critical design goal: Fluentd can process about 13,000 events per second per core while using only 30 MB to 40 MB of memory. It also relies extensively on the JSON format to simplify collecting, filtering, buffering and outputting logs.
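Fluentd's event model pairs a routing tag and a timestamp with an arbitrary JSON record, which is what keeps downstream filtering and routing simple. The sketch below mimics that event shape in plain Python; the `make_event` helper is hypothetical, not part of Fluentd, and only illustrates why JSON-structured events are easy to process.

```python
import json
import time

def make_event(tag: str, record: dict) -> str:
    # Hypothetical helper: packages a log record the way Fluentd's
    # unified logging layer models events -- a tag for routing, a
    # timestamp, and a free-form JSON record as the payload.
    event = {
        "tag": tag,                # e.g. "app.web.access", used for routing
        "time": int(time.time()),  # event timestamp in epoch seconds
        "record": record,          # structured payload
    }
    return json.dumps(event)

serialized = make_event("app.web.access", {"method": "GET", "status": 200})
parsed = json.loads(serialized)
print(parsed["tag"])  # the routing key survives the JSON round trip
```

Because every stage sees the same JSON shape, a filter or output plugin only needs to inspect `tag` and `record` fields rather than parse bespoke log lines.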
It can be useful when an organization is doing data collection across many different areas and administrators need to look at data across all of these areas for monitoring and optimization, said Ashwin Rajeev, CTO of Acceldata.
Loki is a general-purpose log aggregation system that simplifies all kinds of logging processes, including for data observability. It helps streamline the storage and querying of logs from across applications, data tools and cloud services.
One key feature is the ability to ingest logs in any format. Loki can store all data in persistent object storage capable of handling petabytes when required. It can also bring metrics from various sources, including Prometheus, Grafana and Kubernetes, into a common user experience. It is also one of the tools Opstrace builds upon, and it works with Vector, Mittal said.
Grafana Labs maintains the project and provides commercial support for enterprises.
OpenTelemetry provides a broad collection of data capture, aggregation and analytics capabilities. It arose from a merger between OpenCensus and OpenTracing. It has a large community supported by companies such as Microsoft, Google and Dynatrace. Although it started out in the application performance and security spaces, the same components can also apply to data observability.
OpenTelemetry makes getting started quick and simple, Mittal said. Data engineers and developers can incorporate instrumentation into their apps and data pipelines using the automatic instrumentation packages provided by the system. The tools help enterprises gather, process and release data in a vendor-agnostic format. This frees enterprises from having to support and manage several observability data formats.
OpenTelemetry can be helpful for data engineering teams tasked with supporting numerous languages and frameworks. It enables enterprises to use a uniform specification for transferring telemetry data, which lowers application overhead, Mittal said.
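The span-based instrumentation idea behind tracing can be illustrated with a toy tracer. This is a conceptual sketch in plain Python, not OpenTelemetry's actual API; in OpenTelemetry, the automatic instrumentation packages wrap pipeline code with equivalent spans for you.

```python
import contextlib
import time

SPANS = []  # collected telemetry; stand-in for an exporter backend

@contextlib.contextmanager
def span(name: str):
    # Toy stand-in for a tracer span: records the operation name and
    # duration so a backend could reconstruct the call tree later.
    start = time.perf_counter()
    try:
        yield
    finally:
        SPANS.append({"name": name,
                      "duration_s": time.perf_counter() - start})

# A small "pipeline step" instrumented with nested spans.
with span("load_csv"):
    with span("parse_rows"):
        rows = [{"id": i} for i in range(3)]

print([s["name"] for s in SPANS])  # ['parse_rows', 'load_csv']
```

The inner span finishes first, so it is recorded first; a real tracer also attaches trace and parent IDs so nesting survives across process and service boundaries.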
Opstrace is a general-purpose observability platform that also supports data observability. It includes comprehensive security features, alert management and collection integrations for gathering data from other tools and services.
Opstrace makes getting started very simple, Mittal said. An Opstrace cluster can be up and running in a matter of minutes, and if an enterprise already has Prometheus set up, the process is even simpler.
Enterprises can install Opstrace in their Google Cloud or AWS account to use it as a full-featured cloud-based observability platform. Integrations with transport layer security (TLS) certificates simplify security configurations for both reading and writing data.
Prometheus is an open source monitoring tool for implementing metrics and triggering alerts that can be useful for data observability. It includes vital capabilities for analyzing time series data and transforming it into derived time series based on various criteria. It comes with the PromQL query language for generating graphs, tables and alerts on the fly. It also supports a variety of visualization tools.
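The kind of derivation Prometheus performs on time series can be sketched in plain Python. The function below is a simplified re-implementation for illustration, roughly what PromQL's `rate()` computes over a window, ignoring details such as counter resets and extrapolation; it is not Prometheus code.

```python
def per_second_rate(samples):
    # samples: list of (timestamp_seconds, counter_value) pairs for a
    # monotonically increasing counter, oldest first.
    # Returns the average per-second increase over the window --
    # a simplified version of PromQL's rate().
    (t0, v0) = samples[0]
    (t1, v1) = samples[-1]
    return (v1 - v0) / (t1 - t0)

# A request counter scraped every 15 seconds: 600 extra requests
# over a 60-second window.
samples = [(0, 100), (15, 250), (30, 400), (45, 550), (60, 700)]
print(per_second_rate(samples))  # 10.0 requests per second
```

Turning a raw counter into a per-second rate like this is exactly what "transforming time series into derived time series" means in practice: graphs and alerts are usually built on the derived series, not the raw counter.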
When enterprises have already set up Prometheus, setting up tools like Opstrace is simpler. They only need to make a few small changes to the Prometheus configuration file to transport data from Prometheus to an Opstrace cluster, Mittal said.
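In practice, such a change typically means adding a `remote_write` block to `prometheus.yml`. The fragment below is a hedged sketch: `remote_write`, `url` and `tls_config` are standard Prometheus configuration keys, but the endpoint URL and certificate path are placeholders, and the exact push path an Opstrace cluster expects should come from its documentation.

```
# Sketch of a prometheus.yml fragment; the URL and file path are
# placeholders, not real Opstrace endpoints.
remote_write:
  - url: https://cortex.example.opstrace.io/api/v1/push
    tls_config:
      # Opstrace endpoints are TLS-protected; point Prometheus at the
      # cluster's CA certificate if it is not publicly trusted.
      ca_file: /etc/prometheus/opstrace-ca.pem
```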
Vector is a tool that helps enterprises create data pipelines for logs and metrics. It helps gather, process and distribute the spans, traces, logs and metrics of a data application to the tool of their choice, Mittal said.
It uses a directed acyclic graph (DAG) based methodology to facilitate the flow and transformation of data from one stage to another. Enterprises can define the DAG using formats such as TOML, YAML or JSON files. The Vector Remap Language (VRL) can help transform observability events and set criteria for further filtering and routing events.
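A minimal Vector topology in TOML might look like the following sketch. The component names and log field names are illustrative; the `file`, `remap` and `console` component types and the VRL functions shown are drawn from Vector's standard set, but current documentation should be checked for exact options.

```
# One source -> one VRL transform -> one sink, forming a small DAG.
[sources.app_logs]
type    = "file"
include = ["/var/log/app/*.log"]   # illustrative path

[transforms.parse]
type   = "remap"                   # transform written in VRL
inputs = ["app_logs"]              # edge in the DAG
source = '''
. = parse_json!(string!(.message))
.severity = upcase!(string!(.level))
'''

[sinks.console_out]
type   = "console"
inputs = ["parse"]                 # consumes the transformed events
encoding.codec = "json"
```

The `inputs` lists are what define the edges of the DAG: adding another sink that also lists `"parse"` fans the same transformed stream out to a second destination.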