DETROIT -- A new Istio service mesh architecture that ditches sidecar proxies is turning the heads of enterprise IT pros with its promise of simpler operations, but proponents of the rival Linkerd project argue that the real problem isn't the sidecar architecture -- it's the Envoy proxy.
The service mesh approach to networking in distributed application environments first came about in 2016 with Linkerd version 1, which was designed for VM environments. Istio, backed by Google, IBM and Lyft, emerged in 2017 specifically for use with Kubernetes container orchestration. Linkerd version 2 followed with a focus on Kubernetes, and since then, the growing ubiquity of container orchestrators has raised the popularity of service mesh as a way to shift the burden of complex microservices networking away from both the application layer and the developers who create applications.
Until this year, the basic architectural design of each of these service meshes was the same -- both used a specialized type of container called a sidecar proxy to offload network management from applications. These sidecar proxies were closely tied to applications, deployed as part of each Kubernetes pod, and this closeness allowed for finer-grained control over application routing and monitoring than was possible with traditional networks.
But as service mesh has gained more widespread use in high-scale environments, problems with sidecar proxies have emerged. Having a container tied to each application in highly performance-sensitive environments can cause untenable overhead. Sidecars can also make upgrades to the service mesh painful, because an upgrade requires every sidecar to restart, which can affect application availability.
It's also possible for application containers to get out of sync with sidecar containers, creating more potential reliability issues. And managing a huge fleet of sidecars can be an unjustifiable burden in environments where applications might need some features of the service mesh, such as mutual TLS (mTLS), that occur at lower layers of the Open Systems Interconnection model -- specifically, Layer 4 -- but don't need all the finer application-level filtering that occurs higher up at Layer 7.
Istio Ambient Mesh, an experimental-stage project that engineers from Google and Solo.io donated to open source in September, contains a new architecture that maintainers say sidesteps these issues with service mesh sidecars.
Instead of bundling all the features of the service mesh into a sidecar deployed with every app, Ambient Mesh splits the proxy into two shared layers: a Layer 4 proxy that runs on each Kubernetes node as a DaemonSet, and separately deployed Layer 7 proxies for the applications that need them. IT admins can indicate whether applications require Layer 4 or Layer 7 routing features using the same kinds of Istio configuration files and Kubernetes app files they already have. The consolidated proxies in Ambient Mesh will route the traffic accordingly, without requiring a sidecar for every pod.
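As a rough illustration of that split, the Python sketch below models how the number of proxies and the path that traffic takes might differ between the two designs. It is a toy model under stated assumptions, not Istio code or configuration: the `Workload` type, the proxy labels and the idea of a single per-node Layer 4 proxy are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Workload:
    """A hypothetical application pod; not an Istio or Kubernetes type."""
    name: str
    node: str
    needs_l7: bool  # True if the app needs application-level (Layer 7) features


def sidecar_proxy_count(workloads: List[Workload]) -> int:
    # Sidecar model: every pod carries its own full proxy.
    return len(workloads)


def node_proxy_count(workloads: List[Workload]) -> int:
    # Assumed shared model: one Layer 4 proxy per node that hosts a workload.
    return len({w.node for w in workloads})


def proxy_path(workload: Workload) -> List[str]:
    # All traffic passes the shared Layer 4 proxy (e.g. for mTLS);
    # only workloads that ask for it are also sent through a Layer 7 proxy.
    path = ["l4-node-proxy"]
    if workload.needs_l7:
        path.append("l7-proxy")
    return path


workloads = [
    Workload("cart", node="node-a", needs_l7=False),
    Workload("checkout", node="node-a", needs_l7=True),
    Workload("search", node="node-b", needs_l7=False),
]
```

In this toy fleet, the sidecar model runs three proxies, while the shared model runs two node-level proxies and sends only `checkout` through the Layer 7 hop.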
It's still early for this new approach, but some Istio users said they're eager to check it out.
"It's amazing -- we're going to adopt it ASAP," said David Ortiz, principal software engineer at martech company Constant Contact, in an online interview this week. "It significantly simplifies the operations of Istio, specifically around upgrades."
One KubeCon attendee said he plans to evaluate Ambient Mesh carefully as it matures, but he's interested.
"Sidecars were helpful to get things going, but we like the idea of being able to service and scale Layer 7 and Layer 4 differently," said Greg Otto, executive director of cloud services at cable provider Comcast, in an interview here this week. "At the edge, we're very focused on Layer 7 [filtering], but we don't want to carry all of that through our entire [network] where Layer 4 [routing] is more appropriate."
While sidecar proxies offer the strictest separation between services for security purposes, most of the critical Common Vulnerabilities and Exposures (CVEs) in Istio's Envoy proxy have been at the Layer 7 level, Otto said.
"Where we don't need [Layer 7 filtering], I don't want to have to carry it," he said. "Because then, if there is a CVE, I have a much smaller attack surface that I don't have to worry about."
Linkerd counterpoint: The problem isn't sidecars, it's Envoy
There's another way to reduce critical vulnerabilities at Layer 7 and much of the resource overhead associated with sidecars, according to Linkerd creator and Buoyant CEO William Morgan: Don't use Envoy.
"At the end of the day, sidecars are actually extremely simple -- they're very straightforward operationally, people understand them, and the failure and security domains are very clear," Morgan said. "The issue is not the sidecar -- the problem is that you have this giant, multipurpose, resource-hungry and hard-to-operate proxy [with Envoy]."
Support for Envoy, a popular graduated Cloud Native Computing Foundation project, has been a selling point for Istio over Linkerd in the past. But Linkerd maintainers, led by Morgan, have stuck to their own proxy, designed solely for use in a service mesh and with a smaller codebase and smaller resource requirements than Envoy.
As a result, one enterprise Linkerd user said he hasn't seen a need for a sidecarless service mesh, and that it's still possible to have simplicity and transparency with a sidecar.
"From our perspective, a sidecar is simple and easy to understand -- it's the same [kind of container] technology we use for everything else," said Kasper Nissen, lead platform architect at Lunar, a digital financial services company based in Denmark, in an interview here this week. "We went full service mesh by default for everything a year and a half ago, and we saw maybe a 10% increase in resource consumption, which wasn't that much compared to all the features we were getting like mTLS and [detailed] visibility."
Nissen said he has encountered sync issues with sidecar proxies and the Humio log analytics app Lunar uses. That service didn't have time to offload its local data if a sidecar restarted, which meant some data would go missing before Nissen's team found a workaround that amounts to "set a timeout and hope for the best," he said.
However, Morgan and Nissen contended that the issue of sidecar synchronization has its roots in a deeper problem with Kubernetes networking that has remained unsolved in the open source community for three years. By default, there isn't a way to ensure that various containers, whether short-lived init containers used by services such as Linkerd or regular application containers, spin up and spin down in a specific order. A Kubernetes Enhancement Proposal was created to address this in 2019, but was rejected; discussions have continued in the community, but the situation hasn't changed.
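Without an ordering guarantee from the platform, a container that depends on another one is typically left to poll and give up after a deadline. The Python sketch below shows that general pattern only; it is not the workaround Lunar used, which the article does not detail, and `wait_until_ready` and its `is_ready` callback are hypothetical names, not part of Kubernetes or Linkerd.

```python
import time
from typing import Callable


def wait_until_ready(
    is_ready: Callable[[], bool],
    timeout_s: float,
    interval_s: float = 0.1,
) -> bool:
    """Poll is_ready() until it returns True or timeout_s elapses.

    Returns True if the dependency became ready in time, False otherwise.
    The caller decides what to do on False: this is the "set a timeout
    and hope for the best" part.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

An application might call `wait_until_ready(proxy_is_up, timeout_s=30)` before opening connections, where `proxy_is_up` is whatever health check the team can write. The weakness is visible in the return value: when the deadline passes, the application has to proceed or fail without knowing the state of the other container.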
"It's something you would expect Kubernetes to be able to do now, especially with all the services being delivered as sidecars," Nissen said.
Fixing this issue in Kubernetes is the best way to address sidecar synchronization issues, Morgan said.
"It's not a very exciting statement to make in the fad-driven cloud-native world, but sidecars will continue to be the future of service mesh," Morgan said. "We know they have warts, but many of them will ultimately be addressed by Kubernetes changes, not by changing the architecture dramatically and having the things that are a lot more complicated operate your infrastructure."
Beth Pariseau, senior news writer at TechTarget, is an award-winning veteran of IT journalism. She can be reached on Twitter @PariseauTT.