Running Stateful Apps in Kubernetes

With Kubernetes you will eventually, have the need to run stateful applications in Kubernetes. This is more common than you think. If you have never run stateful apps on Kubernetes before this can be a scary thing adding more moving parts to a Kubernetes cluster, deploying the app, as well as managing your stateful application/s on Kubernetes when it requires state.

In this blog post I am going to take you on a short journey to gain an understanding of Stateless vs Stateful applications, how storage works in Kubernetes touching on volumes, storage classes, persistent volumes (PC), and persistent volume claims (PVC), what Stateful Sets are, about Persistent state with pods, and good practices for running Stateful Apps on Kubernetes.

Stateless

A stateless app is an application program that does not save client data generated in one session for use in the next session with that client.

Stateful

A stateful app is a program that saves client data from the activities of one session for use in the next session.

The data that is saved is called the application’s state. Here is a visual covering the differences between Stateless and Stateful applications:

Volumes

Here is a breakdown of what volumes are:

  • A volume is a directory, typically with data in it, that is accessible to the containers in a pod.
    • A volume represents a way to store, retrieve, and persist data across pods through an applications lifecycle.
    • Volume modes in Kubernetes supports are Filesystem or Block.
    • Volumes are backed by different types of storage such as NFS, iSCSI, or other cloud storage (i.e. awsElasticBlockStore, azureDisk, gcePersistentDisk etc..).
    • When pods ceases to exist, Kubernetes destroys ephemeral volumes; however, Kubernetes does not destroy persistent volumes.

StorageClasses

Here is a breakdown of what volumes are:

  • Define types of storage tiers like Premium and Standard through Storage Classes in Kubernetes.
    • Give K8s admins a way to describe the “classes” of storage they offer.
    • StorageClasses define the provisioner, parameters, and reclaimPolicy used when a PersistentVolume is provisioned.
    • When a pod is deleted the underlying storage resource can either be deleted or kept for use with a future pod.
    • A reclaim Policy controls the behavior of the underlying storage resource when pod & the its persistent volume are no longer required.

Example of a configuration file for a StorageClass:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: managed-premium-retain
provisioner: kubernetes.io/azure-disk
reclaimPolicy: Retain
parameters:
  storageaccounttype: Premium_LRS
  kind: Managed

Reclaim Policy

Here is a breakdown of what Reclaim Policies:

  • Retain –
    • Allows for manual reclamation of the resource. The PV is not available for another claim due to previous claimant’s data remaining on the volume. A K8s admin must manually reclaim the volume.
    • Delete –
      • The delete reclaim policy removes the PV resource from the K8s cluster, & the associated storage asset such as cloud storage, NFS etc…
    • Recycle –
      • Performs a basic scrub on the volume & makes it available again for a new PVC.

Persistent Volumes (PVs)

Here is a breakdown of what Persistent Volumes are:

  • A persistent volume (PV) is a storage resource created and managed by the Kubernetes API that can exist beyond the lifetime of an individual pod.
    • A Persistent Volume can be manually provisioned by an Kubernetes admin or dynamically provisioned using Storage Classes by the Kubernetes API server.
    • Dynamic provisioning uses a StorageClass to identify what type of storage (NFS, iSCSI, or cloud-based) needs to be created.

Example of a configuration file for the PersistentVolume:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv0010
spec:
  capacity:
   storage: 40Gi
  volumeMode: Filesystem
  accessModes:
   - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  storageClassName: slow
  mountOptions:
   - hard
   - nfsvers=4.1
  nfs:
   path: /tmp
   server: 172.19.0.22

Persistent Volume Claims (PVCs)

Here is a breakdown of what Persistent Volumes Claims are:

  • A PersistentVolumeClaim (PVC) is a request for storage by a user.
    • A PersistentVolumeClaim specifies the volume mode of either Block or File storage from a StorageClass, the access mode, and the capacity needed.
    • PVC Access Modes Are:
      • ReadOnlyMany (ROX) allows being mounted by multiple nodes in read-only mode.
      • ReadWriteOnce (RWO) allows being mounted by a single node in read-write mode.
      • ReadWriteMany (RWX) allows multiple nodes to be mounted in read-write mode.

Example of a configuration file for the PersistentVolumeClaim:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc0002
spec:
  storageClassName: manual
  accessModes:
   - ReadWriteOnce
  resources:
   requests:
    storage: 10Gi

Lifecycle of a Volume & Claim

Let’s take a look at how the lifecycle of volumes and claims flow:

StatefulSets

Here is a breakdown of what Stateful Sets are:

  • StaefulSets are Kubernetes objects that are used when we need each pod to have its own independent state & use its own individual volume.
    • With StatefulSets each pod is assigned a unique name & the unique name stays with it even if the pod is deleted & recreated.
    • Headless services are primarily used when we deploy statefulset applications. Headless services don’t operate like load balancers. Headless services are not assigned IPs like a regular service is.

StatefulSets are typically used when the following is needed:

  • unique network identifiers for pods
    • persistent storage for retaining data
    • Ordered, graceful deployment, & scaling of pods
    • Ordered, & automated rolling updates of the app

Some Good Practices When Running Stateful Apps on Kubernetes

That wraps up this blog post! Thanks for reading and stay tuned to my blog for more content on Kubernetes soon.

Read more