By design, containers are lightweight, ephemeral and stateless. But organizations have many options when it comes to using containers for stateful applications.
Orchestrators such as Kubernetes spin up, stop, destroy and re-create containers in response to changing workload requirements. While this approach works well for small workloads and stateless applications, it does not suit applications originally designed to be stateful.
By containerizing stateful applications, organizations can take advantage of containers' flexibility, resiliency and speed. But to do so, they need ways to persist storage and manage state.
Learn how to containerize stateful applications and app components using Kubernetes, including the following:
- the difference between stateless and stateful applications;
- how to maintain state in Kubernetes with PersistentVolumes and StatefulSets; and
- top Kubernetes-based tools for container storage.
Stateless vs. stateful applications
A stateless application or component does not save any client data generated in the current session to use in future sessions. Every operation in a stateless application runs as if for the first time. Common use cases for stateless applications include content delivery networks and printing services that process short-duration print jobs.
A stateful application, on the other hand, saves specific data from previous sessions. Stateful applications are often used when user preferences and actions need to be remembered across sessions -- for example, storing items in an online shopping cart, chat history in a messaging app or data in an analytics app that processes the same information repeatedly.
It's practically impossible to run an enterprise-level application without maintaining some level of application state. But containerizing stateful applications and application components can lead to storage, state and data management challenges for IT teams.
How to run stateful applications in Kubernetes
PersistentVolumes and StatefulSets are the main approaches for running stateful applications in Kubernetes.
In a stateful containerized application, data must be persistent, retained and easy to access outside the application. This is where PersistentVolumes come into play.
PersistentVolumes are Kubernetes cluster resources: pieces of storage in a cluster that reference a physical data location. A PersistentVolume has a lifecycle independent of any pod that uses it, so applications running in pods can claim the storage, and the data is not lost when Kubernetes destroys or re-creates a pod.
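As a rough illustration, a statically defined PersistentVolume might look like the sketch below. The name `example-pv`, the class name `manual` and the path `/mnt/data` are placeholders chosen for this example, and a hostPath volume is only reasonable on a single-node test cluster.

```yaml
# Minimal sketch of a PersistentVolume; all names and paths are hypothetical.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv
spec:
  capacity:
    storage: 1Gi                          # size offered to claims
  accessModes:
    - ReadWriteOnce                       # mountable read-write by a single node
  persistentVolumeReclaimPolicy: Retain   # keep the data after the claim is deleted
  storageClassName: manual                # claims must request the same class to bind
  hostPath:
    path: /mnt/data                       # directory on the node; test clusters only
```

Because the volume is a cluster-level object rather than part of a pod spec, deleting or rescheduling a pod leaves it and its data in place.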
The Kubernetes control plane watches for PersistentVolumeClaims (PVCs), which are requests for persistent volume resources. A PVC can name a specific storage class -- denoted as StorageClass -- backed by a storage system such as Ceph File System (CephFS) or Azure Disks. After users create a new PVC, a control loop finds or provisions a matching PersistentVolume and binds it to the PVC so that the volume can be mounted in the requesting pod.
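The claim-and-mount flow described above can be sketched as a PVC plus a pod that references it. The names `app-data-claim` and `app`, the class name `manual` and the `nginx:1.25` image are assumptions made for illustration; the claim binds only if the cluster has a matching PersistentVolume or a provisioner for that class.

```yaml
# Sketch of a claim and a pod that mounts it; all names are hypothetical.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data-claim
spec:
  storageClassName: manual     # assumed class; must match an available volume or provisioner
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi             # the bound volume must offer at least this much
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: nginx:1.25
      volumeMounts:
        - name: data           # refers to the volume declared below
          mountPath: /usr/share/nginx/html
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: app-data-claim   # the pod names the claim, not the underlying volume
```

The pod refers only to the claim, so the same manifest can run against different storage back ends as long as a suitable volume can be bound.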
At time of publication, Kubernetes supports the following PersistentVolume plugins:
- AWS Elastic Block Store
- Microsoft Azure Disks
- Microsoft Azure Files
- CephFS volumes
- Container Storage Interface
- Fibre Channel storage
- Google Compute Engine Persistent Disk
- GlusterFS volumes
- hostPath volumes
- iSCSI storage
- Network File System storage
- Portworx volumes
- RADOS Block Device volumes
- VMware vSphere virtual machine disk volumes
StatefulSets are Kubernetes objects that enable IT admins to deploy pods with persistent characteristics in a stateful application. StatefulSets maintain a sticky identity -- one that persists despite rescheduling -- for each pod and attached storage. StatefulSets include the following features:
- unique network identifiers, such as consistent instance names;
- persistent storage volumes based on user-defined storage classes; and
- ordered deployments, scaling and updates.
StatefulSets make pod identity predictable by appending an ordinal index, starting at zero, to the StatefulSet name -- for example, web-0, web-1, web-2 -- rather than assigning pods random IDs. By default, Kubernetes creates pods sequentially and terminates them in reverse order.
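The naming and ordering behavior can be seen in a stripped-down StatefulSet such as the sketch below. The name `web`, the `app: web` label and the `nginx:1.25` image are placeholders, and the manifest assumes a headless Service called `web` already exists in the same namespace.

```yaml
# Sketch showing ordinal naming and ordered rollout; names are hypothetical.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web                           # pods are named web-0, web-1, web-2
spec:
  serviceName: web                    # assumes a headless Service named "web" exists
  replicas: 3
  podManagementPolicy: OrderedReady   # the default: create in ordinal order, delete in reverse
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.25
```

Setting `podManagementPolicy` to `Parallel` instead tells Kubernetes to launch and terminate pods without waiting for each one in turn, which can suit applications that do not depend on startup order.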
Once created, a StatefulSet works to keep the desired number of pods running. In the event of a failure, it terminates and replaces pods, automatically reattaching each replacement pod to the persistent storage defined for it in the specification. The following code is an example manifest file for deploying a stateful web app service using StatefulSets.
    apiVersion: v1
    kind: Service
    metadata:
      name: nginx-svc
      labels:
        app: nginx
    spec:
      ports:
        - port: 80
          name: web
      clusterIP: None              # headless Service: gives each pod a stable DNS name
      selector:
        app: nginx
    ---
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: web-statefulset
    spec:
      selector:
        matchLabels:
          app: nginx
      serviceName: "nginx-svc"     # must match the name of the headless Service above
      replicas: 2
      minReadySeconds: 10
      template:
        metadata:
          labels:
            app: nginx
        spec:
          terminationGracePeriodSeconds: 10
          containers:
            - name: nginx
              image: k8s.gcr.io/nginx-slim:0.8
              ports:
                - containerPort: 80
                  name: web
              volumeMounts:
                - name: pvdata     # must match the volumeClaimTemplates name below
                  mountPath: /usr/share/nginx/html
      volumeClaimTemplates:        # one PVC is created per pod from this template
        - metadata:
            name: pvdata
          spec:
            accessModes: [ "ReadWriteOnce" ]
            resources:
              requests:
                storage: 1Gi
Common Kubernetes-based container storage tools
Container storage tools make it easier to run stateful containerized applications with Kubernetes.
OpenEBS is an open source project for stateful workloads on Kubernetes. Organizations can adopt cloud-native storage by treating persistent storage like any other container. OpenEBS' features include the following:
- Container-native storage follows the Container Attached Storage (CAS) architecture. Container storage volumes are deployed as a dedicated pod, or pod replicas, in Kubernetes.
- OpenEBS implements high availability by replicating data volumes synchronously.
- OpenEBS has an abstraction layer over storage infrastructure implementations. It works with various underlying cloud deployments.
- Because OpenEBS follows the CAS architecture, IT teams can track and monitor volume metrics, such as latency and throughput, using Kubernetes Dashboard, Prometheus or Grafana.
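From the application's side, using OpenEBS typically comes down to requesting a StorageClass that an OpenEBS engine provisions. The sketch below assumes a class named `openebs-hostpath` is present in the cluster; the actual class names depend on how OpenEBS was installed and which engine is in use, and the claim name is a placeholder.

```yaml
# Sketch of a claim against an OpenEBS-backed class; names are assumptions.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-openebs-claim              # hypothetical claim name
spec:
  storageClassName: openebs-hostpath    # assumed class; check `kubectl get storageclass`
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
```

A pod or a StatefulSet's volumeClaimTemplates can then reference this class in the same way as any other StorageClass.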
Portworx from Pure Storage is enterprise container storage and data management software for Kubernetes that focuses on highly available clusters across multiple nodes and cloud instances. Portworx offers a range of software-based options for storage, disaster recovery and security.
Portworx's features include the following:
- security and encryption;
- autoscaling, where the platform resizes containers and storage volumes automatically;
- programmatic control over storage resources; and
- backups and snapshots.
Although containers are designed to be stateless, many applications today need to recall information across sessions. With container storage tools and Kubernetes' options for maintaining state, IT and DevOps teams can meet the need for stateful applications while taking advantage of the flexibility, consistency and efficiency of containers.