How to manage stateful containers with Kubernetes

Organizations can reap the benefits of containers for stateful applications by using Kubernetes to maintain state in application processes and components.

Prateek Singh, Continuum Managed Services LLC

Published: 05 Oct 2022

By design, containers are lightweight, ephemeral and stateless. But organizations have many options when it comes to using containers for stateful applications.

Orchestrators such as Kubernetes spin up, stop, destroy and re-create containers in response to changing workload requirements. While this approach works well for small workloads and stateless applications, it does not suit applications originally designed to be stateful.

By containerizing stateful applications, organizations can take advantage of containers' flexibility, resiliency and speed. But to do so, they need ways to persist storage and manage state.

Learn how to containerize stateful applications and app components using Kubernetes, including the following:

the difference between stateless and stateful applications;
how to maintain state in Kubernetes with PersistentVolumes and StatefulSets; and
top Kubernetes-based tools for container storage.

Stateless vs. stateful applications

A stateless application or component does not save any client data generated in the current session to use in future sessions. Every operation in a stateless application runs as if for the first time. Common use cases for stateless applications include content delivery networks and printing services that process short-duration print jobs.

A stateful application, on the other hand, saves specific data from previous sessions. Stateful applications are often used when user preferences and actions need to be remembered across sessions -- for example, storing items in an online shopping cart, chat history in a messaging app or data in an analytics app that processes the same information repeatedly.

It's practically impossible to run an enterprise-level application without maintaining some level of application state. But containerizing stateful applications and application components can lead to storage, state and data management challenges for IT teams.

How to run stateful applications in Kubernetes

PersistentVolumes and StatefulSets are the main approaches for running stateful applications in Kubernetes.

1. PersistentVolumes

In a stateful containerized application, data must be persistent, retained and easy to access outside the application. This is where PersistentVolumes come into play.

PersistentVolumes are Kubernetes cluster resources: pieces of storage in a cluster that reference a physical data location. A PersistentVolume's lifecycle is independent of any pod that uses it, so applications running in Kubernetes pods can claim storage whose data is not lost when Kubernetes destroys or re-creates a pod.
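As a rough illustration, a statically provisioned PersistentVolume might look like the following sketch. It assumes a single-node test cluster and uses a hostPath volume. The object name, the storage class name `manual` and the directory on the node are hypothetical placeholders, not values from any particular environment.

```yaml
# Minimal sketch of a PersistentVolume for a single-node test cluster.
# All names and paths below are illustrative assumptions.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: demo-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  # Retain keeps the underlying data after the claim that used it is deleted.
  persistentVolumeReclaimPolicy: Retain
  # Assumed class name; a claim must request the same class to bind to this volume.
  storageClassName: manual
  hostPath:
    # Assumed directory on the node; hostPath is suitable for testing only.
    path: /mnt/demo-data
```

In a multi-node or production cluster, a network-backed or CSI-provisioned volume would normally replace hostPath, because hostPath data lives on one specific node.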

The Kubernetes control plane watches for PersistentVolumeClaims (PVCs), which are requests for persistent volume resources. A PVC can request a specific storage class -- denoted as StorageClass -- backed by a storage system such as Ceph File System (CephFS) or Azure Disks. After users create a new PVC, a control loop finds or provisions a matching PersistentVolume and binds it to the PVC so that the volume can be mounted in the requesting pod.
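The claim-and-mount flow described above can be sketched with two objects: a PVC and a pod that references it. This is a minimal example under stated assumptions. The names `demo-claim` and `demo-pod`, the storage class `manual` and the `nginx` image are placeholders, and the cluster must offer a PersistentVolume or a provisioner that can satisfy the claim.

```yaml
# Sketch of a PersistentVolumeClaim; names and class are illustrative assumptions.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-claim
spec:
  accessModes:
    - ReadWriteOnce
  # Assumed class; it must match an available PersistentVolume or a provisioner.
  storageClassName: manual
  resources:
    requests:
      storage: 1Gi
---
# Pod that mounts the claimed volume. The pod refers to the claim by name,
# not to the PersistentVolume directly.
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: data
          mountPath: /usr/share/nginx/html
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: demo-claim
```

Because the pod names only the claim, the pod can be deleted and re-created without changing which volume holds its data.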

At time of publication, Kubernetes supports the following PersistentVolume plugins:

AWS Elastic Block Store
Microsoft Azure Disks
Microsoft Azure Files
CephFS volumes
Container Storage Interface
Fibre Channel storage
Google Compute Engine Persistent Disk
GlusterFS volumes
hostPath volumes
iSCSI storage
Network File System storage
Portworx volumes
RADOS Block Device volumes
VMware vSphere virtual machine disk volumes

2. StatefulSets

StatefulSets are Kubernetes objects that enable IT admins to deploy pods with persistent characteristics in a stateful application. StatefulSets maintain a sticky identity -- one that persists despite rescheduling -- for each pod and attached storage. StatefulSets include the following features:

unique network identifiers, such as consistent instance names;
persistent storage volumes based on user-defined storage classes; and
ordered deployments, scaling and updates.
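To make the storage class feature concrete, the following sketch defines a StorageClass and shows how a StatefulSet's volumeClaimTemplates could reference it. It assumes a cluster running the AWS EBS CSI driver. The class name `fast-ssd` and the template name `www` are hypothetical, and other clusters would use a different provisioner and parameters.

```yaml
# Sketch of a user-defined StorageClass, assuming the AWS EBS CSI driver is installed.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd                 # illustrative name
provisioner: ebs.csi.aws.com     # assumption: replace with the cluster's own provisioner
parameters:
  type: gp3
reclaimPolicy: Delete
# Delay provisioning until a pod using the claim is scheduled.
volumeBindingMode: WaitForFirstConsumer
---
# Fragment of a StatefulSet spec (not a complete object): each replica gets
# its own PVC created from this template, using the class defined above.
spec:
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: fast-ssd
      resources:
        requests:
          storage: 1Gi
```

If storageClassName is omitted from the template, the claim typically falls back to the cluster's default storage class, if one is configured.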

StatefulSets make pod identity predictable by naming each pod with the StatefulSet name followed by an ordinal index that starts at zero -- for example, web-0, web-1, web-2 -- rather than assigning pods random IDs. By default, Kubernetes creates and deploys pods sequentially and terminates them in reverse order.
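The ordering behavior is controlled by a few fields in the StatefulSet spec. The fragment below is a sketch rather than a complete manifest, and the replica count and partition value are arbitrary choices made for illustration.

```yaml
# Fragment of a StatefulSet spec (not a complete object).
spec:
  replicas: 3
  # OrderedReady is the default: pods are created in ordinal order (0, 1, 2)
  # and terminated in reverse. Parallel launches or deletes pods without waiting.
  podManagementPolicy: OrderedReady
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      # Only pods with an ordinal greater than or equal to the partition are
      # updated; here pod 0 keeps the old template until the partition is lowered.
      partition: 1
```

A partition is one way to stage a rollout, because it lets a team verify the new pod template on the higher ordinals before updating the rest.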

Once created, StatefulSets ensure the desired number of pods is running at all times. In the event of a failure, StatefulSets terminate and replace pods, automatically reattaching each replacement pod to the persistent storage defined in the specification. The following code is an example manifest file for deploying a stateful web app service using StatefulSets.

apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web-statefulset
spec:
  selector:
    matchLabels:
      app: nginx
  serviceName: "nginx"   # must match the name of the headless Service above
  replicas: 2
  minReadySeconds: 10
  template:
    metadata:
      labels:
        app: nginx
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: nginx
        image: k8s.gcr.io/nginx-slim:0.8
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www   # must match the volumeMounts name in the container spec
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi

Common Kubernetes-based container storage tools

Container storage tools make it easier to run stateful containerized applications with Kubernetes.

OpenEBS

OpenEBS is an open source project for stateful workloads on Kubernetes. With OpenEBS, organizations can adopt cloud-native storage by running persistent storage in containers and managing it like any other containerized workload. OpenEBS' features include the following:

Container-native storage follows the Container Attached Storage (CAS) architecture. Container storage volumes are deployed as a dedicated pod, or pod replicas, in Kubernetes.
OpenEBS implements high availability by replicating data volumes synchronously.
OpenEBS has an abstraction layer over storage infrastructure implementations. It works with various underlying cloud deployments.
Because OpenEBS follows the CAS architecture, IT teams can track and monitor storage metrics, such as latency and throughput, using Kubernetes Dashboard, Prometheus or Grafana.
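From an application's point of view, using OpenEBS usually comes down to requesting one of its storage classes in a PVC. The sketch below assumes OpenEBS is already installed and exposes a Local PV hostpath class named `openebs-hostpath`. Class names vary by installation, so confirm the name with `kubectl get storageclass` before using it. The claim name is hypothetical.

```yaml
# Sketch of a PVC that requests OpenEBS-backed storage.
# Assumption: the cluster has a StorageClass named openebs-hostpath.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: openebs-demo-claim       # illustrative name
spec:
  storageClassName: openebs-hostpath
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
```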

Portworx

Portworx from Pure Storage is enterprise container storage and data management software for Kubernetes that focuses on highly available clusters across multiple nodes and cloud instances. Portworx offers a range of software-based options for storage, disaster recovery and security.

Portworx's features include the following:

security and encryption;
autoscaling, where the platform resizes containers and storage volumes automatically;
programmatic control over storage resources; and
backups and snapshots.

Although containers are designed to be stateless, many applications today need to recall information across sessions. With container storage tools and Kubernetes' options for maintaining state, IT and DevOps teams can meet the need for stateful applications while taking advantage of the flexibility, consistency and efficiency of containers.

