Enterprise IT pros are applying the Agile/DevOps philosophy of continuous improvement to Kubernetes security automation to keep pace with increasingly complex, multi-cluster container environments where developers also demand flexibility.
When carmaker Audi AG first began to use Kubernetes in 2017, it had one centralized platform for its DevOps team based on a proprietary Kubernetes control plane. But as the use of Kubernetes began to expand to more departments, and several public and hybrid cloud Kubernetes services emerged, Audi IT pros grappled with how to maintain both security and flexibility for the broader company.
Initially, discussions among Audi's platform teams focused on ways to standardize lower layers of the Kubernetes infrastructure to streamline security operations and cut down on management complexity. Eventually, however, they had to abandon that idea.
"There is no one size that fits all," said Sebastian Kister, team lead and platform owner for Kubernetes and public clouds at Audi, in a presentation at the virtual Red Hat Summit this month. "In a big company, if a project needs tailored infrastructure, it will make one, or buy one."
Forcing standardization among projects would only make shadow IT more likely, Kister's team realized. Worse, projects that did use standard infrastructure layers could develop risky dependencies, including on the vendors that supplied them.
Moreover, Kister said, he and his colleagues determined that such standardization wouldn't actually cut costs and would instead come with potentially onerous tradeoffs.
"Sometimes a workload definition and an app architecture is so different [between projects] the migration effort is eliminating new feature development for months," Kister said. "Thus making innovation impossible and a return on investment pretty unlikely."
Kubernetes Operators decouple security from platforms
Still, the prospect of designing decentralized Kubernetes security automation for 30 different platforms in multiple clouds and data centers used by Audi and its parent company, Volkswagen Group, was daunting. It was made even more challenging by the fact that such automation for cloud-native apps also must be continuous to be effective.
"There is no use for the old plan/build/run models in IT anymore," Kister said in an interview. "You can't run an infrastructure safely [by] updating it every third month -- you need to have constant updates, or you have no chance of creating a secure environment."
For Audi, the glue holding Kubernetes security automation together -- but not too tightly -- took the form of Kubernetes Operators, packages of open source code that automate the installation and update of cluster resources and application services. Third-party ISVs, Kubernetes platform vendors and Audi's internal developers can all maintain their own Operators as "integration particles," as Kister described them, that work in most Kubernetes environments.
"If they're creating a service anyway, why not create an Operator for it to mix and match it with any platform out there based on a [recent] Kubernetes version?" Kister said. "You can put the responsibility for integration not onto the platform, but the owner of the service."
Operators vary in sophistication -- level 1 Operators aren't as useful as level 3 or 5, according to Kister. But they're still a better answer to Kubernetes security automation than trying to standardize infrastructure, he said.
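The install-and-update automation that Operators provide is commonly built around a reconcile loop: compare the state the service owner declares with what the cluster currently runs, then act on the difference. The following is a minimal sketch of that idea only, not Audi's code and not a real Kubernetes client. `ServiceSpec`, `FakeCluster`, `reconcile` and `disconnect` are hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceSpec:
    """Desired state declared by the service owner (hypothetical fields)."""
    name: str
    version: str
    replicas: int


@dataclass
class FakeCluster:
    """In-memory stand-in for a cluster API; not a real Kubernetes client."""
    installed: dict = field(default_factory=dict)

    def get(self, name):
        return self.installed.get(name)

    def apply(self, spec):
        self.installed[spec.name] = spec

    def delete(self, name):
        self.installed.pop(name, None)


def reconcile(cluster, desired):
    """Compare desired and observed state and return the action taken."""
    observed = cluster.get(desired.name)
    if observed is None:
        cluster.apply(desired)
        return "installed"
    if observed != desired:
        cluster.apply(desired)
        return "updated"
    return "in-sync"


def disconnect(cluster, name):
    """Remove the service; the platform itself is left untouched."""
    cluster.delete(name)
```

Because the loop only depends on the small `get`/`apply`/`delete` surface, the same service logic could in principle be pointed at any platform that offers it, which is the decoupling the Operator approach is meant to provide.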
"You can connect and disconnect them, and when a service is not capable anymore, you can just throw it away -- it's just an Operator, it didn't cost [much]," Kister said.
GitOps streamlines Kubernetes security management
Like Audi, Discover Financial Services wanted a cohesive way to manage security policies and controls that didn't limit the number or types of clusters development teams could create. Over the last few months, Discover also added an overarching security automation layer to accomplish this, using Red Hat's Advanced Cluster Management (ACM) software for OpenShift.
ACM, introduced by Red Hat in 2020 in response to growing demand for multi-cluster management among enterprise customers, gives administrators a central view of OpenShift clusters based on various attributes. These include the part of the development process that runs on them, such as test, stage and production; their region; platform owner; AWS account; OpenShift version and more.
As with Audi, Discover's security automation through ACM is continuous -- ACM is maintained in its own Kubernetes cluster that keeps the rest in sync with the latest version of policies, constantly checking for updates and applying them immediately. These policies are recorded and version-controlled in GitHub repositories, an approach known as GitOps.
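The GitOps pattern described here can be pictured as a comparison between the policy revisions recorded in Git and the revisions each cluster has applied, filtered by cluster attributes such as environment or region. The sketch below is an assumption-laden illustration, not ACM's API or Discover's configuration. The dictionary layout, the `selector`, `revision` and `applied` keys, and the function names are all hypothetical.

```python
def matches(labels, selector):
    """True when every selector key/value pair is present in the cluster labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def plan_sync(policies, clusters):
    """Return {cluster_name: [policy names to re-apply]} for out-of-date clusters."""
    plan = {}
    for cluster_name, cluster in clusters.items():
        stale = [
            policy["name"]
            for policy in policies
            if matches(cluster["labels"], policy["selector"])
            and cluster["applied"].get(policy["name"]) != policy["revision"]
        ]
        if stale:
            plan[cluster_name] = stale
    return plan


# Hypothetical data: policies as merged in Git, and what each cluster reports.
policies = [
    {"name": "require-tls", "revision": "abc123", "selector": {"env": "production"}},
    {"name": "image-allowlist", "revision": "def456", "selector": {}},
]
clusters = {
    "prod-east": {
        "labels": {"env": "production", "region": "us-east"},
        "applied": {"require-tls": "old111", "image-allowlist": "def456"},
    },
    "test-west": {
        "labels": {"env": "test", "region": "us-west"},
        "applied": {"image-allowlist": "def456"},
    },
}
```

With this data, `plan_sync(policies, clusters)` flags only `prod-east`, because `require-tls` targets production clusters and that cluster still holds an older revision. Merging a change in Git alters the `revision` value, and the next comparison picks it up without a separate deployment workflow.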
"I don't need to [design] a workflow that would deploy or update clusters," said Sriram Mageswaran, principal systems architect at Discover, in a Red Hat Summit virtual presentation. "I just need to approve a [pull request], and [an update] will go to all the clusters where it's intended to go."
DevOps pipelines automate patching, policy
Other financial services companies look to DevOps software delivery pipelines as a centralized control point for Kubernetes security automation that is also decoupled from the clusters themselves.
Citigroup, for example, recently overhauled a pipeline-based process for "repaving" its Java middleware Docker container images to apply security patch updates faster and continuously, according to a Red Hat Summit presentation. This replaced a 90-day manual patching regimen that required extensive functional tests for application-specific dependencies in a pipeline composed of Jenkins continuous integration, JFrog artifact management and Bitbucket continuous deployment.
Citigroup still uses the same pipeline tools, but it refined and unified its processes to create repeatable, reusable test modules and a standardized test framework, as well as a means of building dependent layers of application software in parallel, rather than waiting for each to be finished before building the next. Citigroup IT teams also devised an event-driven system that automatically triggers a container rebuild via Bitbucket whenever new images in a pre-production Docker registry are tagged with "latest."
The result is a fully automated process that performs exception handling, self-monitoring, self-healing, logging and audit trail functions as container images are continuously updated, according to the presentation.
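One way to picture an event-driven rebuild of layered images is a handler that reacts only to a "latest" tag and then rebuilds everything downstream, running independent images side by side. The sketch below is one possible reading of that idea and is not Citigroup's implementation; it does not call Jenkins, JFrog or Bitbucket. The `DEPENDENTS` map, the image names, the event shape and the `build` placeholder are hypothetical, and the layering is assumed to be tree-shaped.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical image layering: each image lists the images built on top of it.
# Assumed to be a tree (no image has two parents).
DEPENDENTS = {
    "os-base": ["java-runtime"],
    "java-runtime": ["middleware-a", "middleware-b"],
    "middleware-a": ["app-payments"],
    "middleware-b": ["app-trade"],
}


def rebuild_waves(updated_image):
    """Group downstream images into waves; images within a wave are independent."""
    waves = []
    current = DEPENDENTS.get(updated_image, [])
    while current:
        waves.append(sorted(current))
        current = [child for image in current for child in DEPENDENTS.get(image, [])]
    return waves


def build(image):
    """Placeholder for a real pipeline job, such as a CI build trigger."""
    return f"rebuilt {image}"


def on_registry_event(event):
    """Trigger rebuilds only when an image is tagged 'latest'."""
    if event.get("tag") != "latest":
        return []
    results = []
    with ThreadPoolExecutor() as pool:
        for wave in rebuild_waves(event["image"]):
            # Images in the same wave are built concurrently.
            results.extend(pool.map(build, wave))
    return results
```

In this sketch a new `os-base` image tagged "latest" cascades through the Java, middleware and application layers, so each rebuilt image picks up the patches from the layers beneath it.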
"At any given moment, when an application container image is rebuilt, it will consume the latest vulnerability patches from the OS layer, Java layer and all the way up to the middleware layer," said Grace Zhang, global engineering lead for treasury and trade solutions technology at the financial services company. "With the implementation of this automation framework, the turnaround time for security patching is now reduced from 90 days down to one day."
OPA provides security automation context
Pipeline-driven Kubernetes security automation was also a common topic at this week's cdCon virtual event. One IT pro from Interswitch Group, a payments processing company based in Lagos, Nigeria, gave a presentation about using Open Policy Agent (OPA) to enforce security policies, both for applications and Kubernetes infrastructure as code, within an OpsMx Spinnaker Enterprise pipeline.
This became necessary as Interswitch and its IT infrastructure grew, and "tribal knowledge" about security policies among platform maintainers was difficult to apply consistently at scale, said AbdulBasit Kabir, a software engineer at Interswitch, in the presentation.
"You could set up pipelines to enforce policies, hard coding them into a [pipeline] tool," Kabir said. "But there are issues with this -- whenever the policy changes, you need to go and do the rework of hard-coding the change into all your pipelines. Pipelines are also not necessarily context-aware."
Instead, OPA acts as a policy decision engine decoupled from its enforcement within the Interswitch DevOps pipeline, which is easier to maintain, Kabir said.
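The separation between deciding and enforcing can be sketched in a few lines. In a real deployment the policies would be written in OPA's Rego language and the pipeline would send its context as a JSON document under an `input` key; here a plain Python function stands in for the engine so the example runs on its own. The rules, field names and registry prefix are hypothetical and are not Interswitch's policies.

```python
def decide(query):
    """Stand-in policy engine: returns a decision for the context in the query."""
    context = query["input"]
    reasons = []
    in_production = context["environment"] == "production"
    if in_production and not context["image"].startswith("registry.internal/"):
        reasons.append("production images must come from the internal registry")
    if in_production and context.get("privileged", False):
        reasons.append("privileged containers are not allowed in production")
    return {"allow": not reasons, "reasons": reasons}


def pipeline_stage(context):
    """The pipeline only asks for a decision; it contains no policy logic."""
    decision = decide({"input": context})
    if not decision["allow"]:
        raise PermissionError("; ".join(decision["reasons"]))
    return "deploy"
```

When a rule changes, only `decide` is edited. Every pipeline that calls `pipeline_stage` picks up the new behavior without being rewritten, which is the maintenance benefit of keeping the decision separate from its enforcement.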
"All the pipeline needs to do is query the policy engine, which knows the context based on what's sent in the query and has the data needed to make a decision," Kabir said.
Beth Pariseau, senior news writer at TechTarget, is an award-winning veteran of IT journalism. She can be reached on Twitter @PariseauTT.