In previous parts of this series, we walked you through StorageClass as one of the Kubernetes objects for data persistence. Let’s now look at another object used for stateful workloads: the StatefulSet. We’ll cover the topics below, with some hands-on practice to show you the functionality of this object.
- What is a StatefulSet?
- How do StatefulSets differ from Deployments?
- How to specify Pods inside StatefulSets?
- How to manage Volumes in a Pod?
- Why use a Service for StatefulSets?
What Is a StatefulSet?
A StatefulSet is a Kubernetes object used to deploy and manage stateful applications. Stateless applications, you may recall from a prior part of this series, are deployed using various Kubernetes objects like Deployments and Pods. We then covered data persistence to manage Stateful applications. So, what are stateful and stateless applications?
Stateful applications are applications that keep track of their past and present state. They store data using persistent storage and read the data back later, so they can survive service breakdowns or restarts. Database applications like MySQL and MongoDB are examples of stateful applications.
On the other hand, stateless applications are applications that do not keep track of any state. They neither store nor read data from any storage; they are basically a one-time request-and-response process. When a stateless application’s current session is interrupted or deleted, the new session starts with a clean slate, without referring to past events or processes. Examples include the Nginx and Tomcat web servers.
How Do StatefulSets Differ from Deployments?
Like a Deployment, a StatefulSet manages a set of replicated Pods; unlike a Deployment, it maintains a sticky identity for each of them. So, if you need:
unique and ordered deployment and scaling,
distinct and stable network identities,
steady and persistent storage across all application scheduling and rescheduling,
then you would use a StatefulSet instead of a Deployment.
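To make the contrast concrete, here is a minimal, illustrative skeleton of the fields that set a StatefulSet apart from a Deployment (the names are placeholders, and the Pod template is omitted for brevity; a complete manifest appears later in this post):

```yaml
apiVersion: apps/v1
kind: StatefulSet              # a Deployment would use kind: Deployment here
metadata:
  name: my-stateful-set
spec:
  serviceName: "my-service"    # Headless Service name; a Deployment has no such field
  replicas: 3                  # Pods are named my-stateful-set-0 through -2, created in order
  selector:
    matchLabels:
      app: service-label
  volumeClaimTemplates: []     # one PVC per Pod; not available in a Deployment
```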
How to Specify Pods Inside StatefulSets
Pods in a StatefulSet have a sticky and unique network identity. They are specified inside a StatefulSet by declaring a “replicas” field as a child of the “spec” property in the StatefulSet YAML manifest. The desired number of Pods is determined by the “replicas” value specified in the manifest file. The configuration will look like this:
spec:
  selector:
    matchLabels:
      app: service-label ## This must be the same as the Pod template and Service labels
  replicas: 3 ## 1 by default. The value specified here determines the number of replicated Pods the StatefulSet will create; in this case, 3 Pods.
How to Manage Volumes in the Pods of a StatefulSet
In the last part of this series, we created a Pod that consumes storage as a volume using a PVC. A “persistentVolumeClaim” field was declared in the manifest YAML file, which gave the Pod access to the PersistentVolume that the PVC is bound to. In the case of a StatefulSet, a different property is used, namely “volumeClaimTemplates”. The template name, accessModes, storageClassName and storage request fields are declared under this property. The Pods claim storage through this section, and it is then mounted into the Pods’ containers using the “volumeMounts” field. The claim and mount configuration in a StatefulSet manifest YAML file will look like this:
        volumeMounts:
        - name: my-volume
          mountPath: /data/path
  volumeClaimTemplates:
  - metadata:
      name: my-volume
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "my-stg-class" ## Dynamic storage provisioning
      resources:
        requests:
          storage: 1Gi ## Storage request
You’ll see more about this when we get to the “How to Create a StatefulSet” section later in this blog post.
What Is the Role of a Service in StatefulSets?
A Service is needed in a StatefulSet for communication across Pods. It links all of the Pods in the StatefulSet and also controls their network domain. The question is, what type of Service is suitable for a StatefulSet? A StatefulSet needs a Headless Service for Pod discovery and to maintain the Pods’ sticky network identity, which is one of the characteristics of StatefulSets. Moreover, you need another Service type to expose the application to the outside world or get an external IP. You can read more on Services in an earlier part of this series.
The Headless Service is referenced in the StatefulSet manifest YAML file by declaring a “serviceName” field as a child of a “spec” property. The value of this field must be the same as the Headless Service name.
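Side by side, the pairing can be sketched like this (a minimal, illustrative fragment using the names from this post; only the fields relevant to the link between the two objects are shown):

```yaml
# Headless Service: gives each Pod a stable DNS entry
apiVersion: v1
kind: Service
metadata:
  name: my-service             # the name the StatefulSet will reference
spec:
  clusterIP: None              # "None" is what makes the Service headless
  selector:
    app: service-label
---
# StatefulSet: references the Service by name
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-stateful-set
spec:
  serviceName: "my-service"    # must match the Headless Service name above
```

With this pairing, each Pod gets a stable DNS entry of the form my-stateful-set-0.my-service.&lt;namespace&gt;.svc.cluster.local.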
How to Create a StatefulSet
A StatefulSet is created by declaring a manifest YAML file just like a Deployment, but with a different “kind” value; in this case, StatefulSet. There are also the various components highlighted above that are needed to create a StatefulSet: a Headless Service, a PersistentVolume (PV) and a PersistentVolumeClaim (PVC). The PV & PVC are required if you are using static storage provisioning in a local cluster. If, however, you are using cloud provider storage from AWS, Azure or GCP, which allows for dynamic storage provisioning, creating a storageClass object is the suitable option. You can read more on the static (PV & PVC) and dynamic (storageClass) storage provisioning methods in a previous part of this series.
Before we begin, you are advised to have basic knowledge of Kubernetes objects like Pod, Deployment, Service, volumes & volumeMounts, PV, PVC and storageClass to follow this exercise. We will go through hands-on practice on a running Kubernetes cluster, so it’s imperative to have one with the kubectl command-line tool already configured to talk to the cluster. KubeOne allows you to create a Kubernetes cluster in any environment easily; check out our documentation on this to get started. Alternatively, you can just use the Kubernetes playground to practice.
The following steps will guide you on how to create a StatefulSet and other necessary components.
Step 1: Create the Headless Service by setting the clusterIP field to “None”. First, open a file for its configuration:
$ vim headless-service.yaml
Copy the below configuration into the above file.
apiVersion: v1
kind: Service
metadata:
  name: my-service
  labels:
    app: service-label
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None ## Headless service
  selector:
    app: service-label
Step 2: Create the Headless Service using the kubectl create command.
$ kubectl create -f headless-service.yaml
service/my-service created
Step 3: Use the kubectl get command to check the details of the Service.
$ kubectl get service my-service
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
my-service   ClusterIP   None         <none>        80/TCP    37s
Step 4: Create a storageClass with the below YAML manifest file. A storageClass is used in this case because the cluster is hosted by a cloud provider. If you are using a local cluster, you will need to create a PV and PVC instead. Check our previous post on how to provision storage using a PV and claim it using a PVC.
$ vim s-class.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: my-storageclass ## Name of the storageClass that will be referenced in the StatefulSet manifest file
provisioner: kubernetes.io/aws-ebs ## Provisioner for the AWSElasticBlockStore plugin
volumeBindingMode: WaitForFirstConsumer ## The binding will wait until the StatefulSet is created
Use the kubectl create command to create the storageClass:
$ kubectl create -f s-class.yaml
storageclass.storage.k8s.io/my-storageclass created
Step 5: Check the details of the storageClass using the kubectl get command:
$ kubectl get storageclass my-storageclass
NAME              PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
my-storageclass   kubernetes.io/aws-ebs   Delete          WaitForFirstConsumer   false                  3m37s
Step 6: Now copy and paste the below configuration into a YAML file with a name of your choice to create a StatefulSet object.
$ vim stateful-set.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-stateful-set
spec:
  selector:
    matchLabels:
      app: service-label # has to match .spec.template.metadata.labels
  serviceName: "my-service" # has to match the created Service name
  replicas: 3 # Number of Pod replicas to be created. It is 1 by default
  template:
    metadata:
      labels:
        app: service-label # has to match .spec.selector.matchLabels
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: my-volume
          mountPath: /data/path
  volumeClaimTemplates:
  - metadata:
      name: my-volume
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "my-storageclass" # must match the created storageClass name
      resources:
        requests:
          storage: 1Gi # storage request
Create the StatefulSet using the kubectl create command:
$ kubectl create -f stateful-set.yaml
statefulset.apps/my-stateful-set created
You can watch the Pods being created using the kubectl get pods -l app=service-label -w command, where app=service-label is the label used in the manifest file. You can install tmux to split your terminal into two panes, which allows you to run the commands in the first terminal and watch the processes in the second. Run the watch command in the first terminal before running the command that creates the StatefulSet in the second.
If everything works well, the viewed terminal output should look like this:
NAME                READY   STATUS              RESTARTS   AGE
my-stateful-set-0   0/1     Pending             0          0s
my-stateful-set-0   0/1     Pending             0          6s
my-stateful-set-0   0/1     ContainerCreating   0          6s
my-stateful-set-0   0/1     ContainerCreating   0          24s
my-stateful-set-0   1/1     Running             0          36s
my-stateful-set-1   0/1     Pending             0          0s
my-stateful-set-1   0/1     Pending             0          6s
my-stateful-set-1   0/1     ContainerCreating   0          6s
my-stateful-set-1   0/1     ContainerCreating   0          12s
my-stateful-set-1   1/1     Running             0          15s
my-stateful-set-2   0/1     Pending             0          0s
my-stateful-set-2   0/1     Pending             0          6s
my-stateful-set-2   0/1     ContainerCreating   0          6s
my-stateful-set-2   0/1     ContainerCreating   0          23s
my-stateful-set-2   1/1     Running             0          26s
The above output shows the order in which the Pods are created. Use the kubectl get command to check the Pod status and see if they are ready.
Step 7: Check the details of the StatefulSet, Pods, PVs and PVCs using the kubectl get command.
Check the details of the StatefulSet:
$ kubectl get statefulset my-stateful-set
NAME              READY   AGE
my-stateful-set   3/3     5m25s
The output shows that the 3 Pod replicas have been created and are ready.
Check the status of the Pods:
$ kubectl get pods
NAME                READY   STATUS    RESTARTS   AGE
my-stateful-set-0   1/1     Running   0          7m6s
my-stateful-set-1   1/1     Running   0          6m44s
my-stateful-set-2   1/1     Running   0          6m26s
Check the status of the PersistentVolumes:

$ kubectl get pv
NAME           CAPACITY   ACCESS MODES   STATUS   CLAIM                                 STORAGECLASS      AGE
pvc-1dcfd012   1Gi        RWO            Bound    default/my-volume-my-stateful-set-1   my-storageclass   5m8s
pvc-2ceee9d3   1Gi        RWO            Bound    default/my-volume-my-stateful-set-2   my-storageclass   4m49s
pvc-8cfe94f5   1Gi        RWO            Bound    default/my-volume-my-stateful-set-0   my-storageclass   5m30s
Check the status of the PersistentVolumeClaims:

$ kubectl get pvc
NAME                          STATUS   VOLUME         CAPACITY   ACCESS MODES   STORAGECLASS      AGE
my-volume-my-stateful-set-0   Bound    pvc-8cfe94f5   1Gi        RWO            my-storageclass   34m
my-volume-my-stateful-set-1   Bound    pvc-1dcfd012   1Gi        RWO            my-storageclass   33m
my-volume-my-stateful-set-2   Bound    pvc-2ceee9d3   1Gi        RWO            my-storageclass   33m
In the above output, the Pod names have specific numbers attached to them, from 0 to 2, unlike a Deployment, where random letters and numbers are attached to a Pod name. Moreover, the Pods are created sequentially: “my-stateful-set-0” is created first, followed by “my-stateful-set-1” and then “my-stateful-set-2”. If you don’t need them created sequentially, you can include a “podManagementPolicy” property in the StatefulSet YAML file with its value set to “Parallel”.
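For illustration, this property sits at the top level of the StatefulSet spec; a minimal fragment (the rest of the manifest stays as created above):

```yaml
spec:
  podManagementPolicy: Parallel   # launch and terminate Pods in parallel; the default is OrderedReady
  serviceName: "my-service"
  replicas: 3
```

Note that the Pods still keep their stable ordinal names (-0, -1, -2); only the ordering guarantee during creation and scaling is relaxed.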
StatefulSet Use Case
We will test the behavior of the StatefulSet using the following steps: exec into one of the Pods, create a file and delete the Pod. The StatefulSet will then recreate the Pod. We will check whether it comes back with the same name as the one that was deleted and whether the data created earlier still exists in the Pod.
Step 1: Exec into one of the Pods using the kubectl exec command. Pod “my-stateful-set-1” will be used in this case.
$ kubectl exec -it my-stateful-set-1 -- bin/bash
root@my-stateful-set-1:/#
Step 2: Change to the directory where the volume is mounted, create and save a file into the directory:
root@my-stateful-set-1:/# cd data/path
root@my-stateful-set-1:/data/path# echo This is a StatefulSet Message > stset.txt
root@my-stateful-set-1:/data/path# cat stset.txt
This is a StatefulSet Message
Step 3: Exit from the Pod, delete the Pod and let it recreate:
$ kubectl delete pod my-stateful-set-1
pod "my-stateful-set-1" deleted
Check the Pods to see if the deleted one is up again:
$ kubectl get pods
NAME                READY   STATUS    RESTARTS   AGE
my-stateful-set-0   1/1     Running   0          33m
my-stateful-set-1   1/1     Running   0          6s
my-stateful-set-2   1/1     Running   0          33m
The output shows that the Pod is up again with the same name. One way to confirm this is to compare the “AGE” column across the Pods: the recreated Pod is much younger than the others.
Step 4: Exec into the Pod once again and check the data created earlier before the Pod was deleted:
$ kubectl exec -it my-stateful-set-1 -- bin/bash
root@my-stateful-set-1:/# cd data/path
root@my-stateful-set-1:/data/path# ls
stset.txt
root@my-stateful-set-1:/data/path# cat stset.txt
This is a StatefulSet Message
root@my-stateful-set-1:/data/path# exit
Scaling a StatefulSet Up or Down
You can scale a StatefulSet up or down by running the below commands on the terminal. The replicas value depends on how many Pods you need.
To scale down:
We will scale the previous StatefulSet down from 3 to 1 replica using the kubectl scale command, and also watch the process to see the scaling order.
$ kubectl scale statefulset my-stateful-set --replicas=1
Check the StatefulSet and Pod status with the kubectl get command:
$ kubectl get statefulset
NAME              READY   AGE
my-stateful-set   1/1     6m
$ kubectl get pod
NAME                READY   STATUS    RESTARTS   AGE
my-stateful-set-0   1/1     Running   0          15m
To scale up:
The StatefulSet will be scaled up from 1 to 6 replicas.
$ kubectl scale statefulset my-stateful-set --replicas=6
Check the StatefulSet status:
$ kubectl get statefulset
NAME              READY   AGE
my-stateful-set   6/6     102m
Use the kubectl get command to check the status of the Pods:
$ kubectl get pods
NAME                READY   STATUS    RESTARTS   AGE
my-stateful-set-0   1/1     Running   0          59m
my-stateful-set-1   1/1     Running   0          59m
my-stateful-set-2   1/1     Running   0          59m
my-stateful-set-3   1/1     Running   0          2m47s
my-stateful-set-4   1/1     Running   0          2m32s
my-stateful-set-5   1/1     Running   0          2m5s
The above output shows that the new Pods (my-stateful-set-3 to my-stateful-set-5) were created sequentially, as their ages indicate. When scaling back down, the Pods terminate in reverse order: scaling from 6 replicas down to 1 would terminate my-stateful-set-5 first and my-stateful-set-1 last.
Delete the StatefulSet, Service, storageClass and PersistentVolumeClaims, in that order, using the kubectl delete command. You can watch the Pods while deleting the StatefulSet to see the order in which they terminate.
$ kubectl delete statefulset my-stateful-set

The watch terminal shows the Pods terminating:

NAME                READY   STATUS        RESTARTS   AGE
my-stateful-set-0   1/1     Running       0          4h59m
my-stateful-set-1   1/1     Running       0          23m
my-stateful-set-2   1/1     Running       0          23m
my-stateful-set-3   1/1     Running       0          23m
my-stateful-set-4   1/1     Running       0          15m
my-stateful-set-5   1/1     Running       0          15m
my-stateful-set-5   1/1     Terminating   0          16m
my-stateful-set-2   1/1     Terminating   0          25m
my-stateful-set-0   1/1     Terminating   0          5h1m
my-stateful-set-3   1/1     Terminating   0          24m
my-stateful-set-1   1/1     Terminating   0          25m
my-stateful-set-4   1/1     Terminating   0          17m
my-stateful-set-2   0/1     Terminating   0          25m
my-stateful-set-1   0/1     Terminating   0          25m
my-stateful-set-5   0/1     Terminating   0          16m
my-stateful-set-3   0/1     Terminating   0          25m
my-stateful-set-0   0/1     Terminating   0          5h1m
my-stateful-set-4   0/1     Terminating   0          17m
my-stateful-set-4   0/1     Terminating   0          17m
my-stateful-set-5   0/1     Terminating   0          16m
my-stateful-set-5   0/1     Terminating   0          16m
my-stateful-set-3   0/1     Terminating   0          25m
my-stateful-set-3   0/1     Terminating   0          25m
The output shows that the Pods terminate concurrently, without waiting for one another to complete the process.
Check the Pods status to see if they have been deleted:
$ kubectl get pods
No resources found in default namespace.
There are some limitations with StatefulSets compared to Deployments. When you delete a StatefulSet, there is no guarantee of ordered, graceful termination, and some Pods may remain unterminated. So, you can scale the StatefulSet down to 0 replicas first and then delete the StatefulSet. Also, the StatefulSet’s volumes remain intact for data preservation until its PersistentVolumeClaims are deleted. The PersistentVolumeClaims, as well as the other components, must be deleted manually.