The State of Apps 4: PersistentVolumes and PersistentVolumeClaims

Kubernetes
June 14, 2021

Previously in this series, we looked at volumes and volumeMounts in Kubernetes. Now we’ll take a step further and introduce two other Kubernetes objects related to data persistence and preservation, namely PersistentVolumes (PVs) and PersistentVolumeClaims (PVCs).

We’ll cover the following topics, including hands-on practice on the functionalities of these concepts:

Kubernetes PersistentVolumes (PVs) and PersistentVolumeClaims (PVCs)

Storage management is essential in Kubernetes, especially in large environments where many users deploy multiple Pods. Users in such an environment often need to configure storage for each Pod, and a change to an existing application's storage must then be repeated on every Pod, one after the other. To avoid this time-consuming work and to separate how storage is provisioned from how it is consumed, we use PersistentVolumes (PVs) and PersistentVolumeClaims (PVCs).

A PersistentVolume (PV) is a piece of pre-provisioned storage in a Kubernetes cluster that can be used across different user environments. Its lifecycle is independent of any Pod that uses it.

A PersistentVolumeClaim (PVC) is a user's request for storage from a PV. Kubernetes binds PVCs to PVs based on the request and the properties set on those PVs: it looks for a PV that satisfies the PVC's requested capacity and specified properties, such as access modes, and each PVC binds to a single PV.

When there are multiple matches, you can use labels and selectors to bind a PVC to a particular PV. This helps guard against a small PVC binding to a much larger PV: because PVs and PVCs have a one-to-one relationship, the remaining storage in the bound PV is inaccessible to other users.
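
As a rough sketch of label-based binding, the manifests below pair a PV that carries a label with a PVC that selects on it. The names labeled-volume and labeled-claim and the tier: fast label are hypothetical, chosen only for illustration. Setting storageClassName to an empty string is an assumption about your cluster: it asks for a PV with no class, so a default StorageClass does not provision a new volume instead.

```yaml
# PV carrying a label that claims can select on.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: labeled-volume        # hypothetical name
  labels:
    tier: fast                # hypothetical label
spec:
  capacity:
    storage: 3Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/app/data"
---
# PVC that only considers PVs labeled tier=fast.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: labeled-claim         # hypothetical name
spec:
  storageClassName: ""        # request a PV with no class (no dynamic provisioning)
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
  selector:
    matchLabels:
      tier: fast
```

The selector narrows the candidate PVs. The capacity and access mode requirements still have to be satisfied by the selected PV.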

NOTE: Both the PVCs and the Pod using them must be in the same namespace.
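
To illustrate the note above, here is a minimal sketch of a PVC and a Pod that share a namespace. The namespace demo and the container and volume names are hypothetical. PVs themselves are cluster-scoped, so they carry no namespace.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-claim
  namespace: demo             # hypothetical namespace
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  namespace: demo             # must match the namespace of the claim
spec:
  containers:
  - name: app                 # hypothetical container name
    image: nginx
    volumeMounts:
    - mountPath: "/app/data"
      name: data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: my-claim     # looked up in the Pod's own namespace
```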

The Difference Between PVs and PVCs in Kubernetes

PVs and PVCs differ in provisioning, functionality, and the person responsible for creating them, specifically:

PVs and PVCs architecture

Difference between Volumes and PersistentVolumes

Volumes and PersistentVolumes differ in the following ways:

PersistentVolumes and PersistentVolumeClaims Lifecycle

The communication between PVs and PVCs consists of the following stages:

How to Create a PersistentVolume

The following steps walk you through creating a PersistentVolume and using it in a Pod. To follow this hands-on practice, you need a basic knowledge of volumes, volumeMounts, and volume types such as hostPath and emptyDir. You also need a running Kubernetes cluster and the kubectl command-line tool configured to talk to it. If you do not have a cluster, you can create one on any environment with KubeOne. Refer to Getting Started for instructions. Alternatively, you can practice in the Kubernetes playground. In that case, you might also need access or credentials for a cloud provider (GKE, AWS, etc.) to provision storage.

Follow the steps below to create a PersistentVolume:

Step 1: Create the YAML file.

$ vim pv-config.yaml

Step 2: Copy and paste the configuration below into the YAML file created above.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-volume
spec:
  capacity:
    storage: 3Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/app/data"

The configuration above shows properties with different functionalities. In addition to the properties from the previous Kubernetes objects exercises, let’s look at accessModes, capacity, and storage properties:

Step 3: Create the PersistentVolume using the kubectl create command.

$ kubectl create -f pv-config.yaml
persistentvolume/my-volume created

Step 4: Check the created PV to see if it is available.

$ kubectl get pv

NAME        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
my-volume   3Gi        RWO            Retain           Available                                   60s

Step 5: Check the description of the PersistentVolume by running the kubectl describe command.

$ kubectl describe pv my-volume

Labels:          <none>
Annotations:     <none>
Finalizers:      [kubernetes.io/pv-protection]
StorageClass:
Status:          Available
Claim:
Reclaim Policy:  Retain
Access Modes:    RWO
VolumeMode:      Filesystem
Capacity:        3Gi
Node Affinity:   <none>
Message:
Source:
    Type:          HostPath (bare host directory volume)
    Path:          /app/data
    HostPathType:
Events:            <none>

As seen above, the PersistentVolume is available and ready to serve a PVC storage request. The status will change from “Available” to “Bound” once it has been claimed.

NOTE: It is not advisable to use the hostPath volume type in a production environment.
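
One reason is that a hostPath volume lives on a single node's filesystem, so a Pod that gets scheduled onto a different node would not find the same data there. As a rough sketch of a network-backed alternative, a PV could point at an NFS export instead. The PV name, server address, and export path below are placeholders, not values from this tutorial.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-nfs-volume              # hypothetical name
spec:
  capacity:
    storage: 3Gi
  accessModes:
    - ReadWriteMany                # NFS can be mounted read-write by many nodes
  nfs:
    server: nfs.example.internal   # placeholder NFS server address
    path: "/exports/app-data"      # placeholder export path
```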

How to Create a PersistentVolumeClaim in Kubernetes

Now that the PersistentVolume has been created, the next step is to create a PVC that claims the volume. Creating a PersistentVolumeClaim is similar to creating the PersistentVolume above, with a few differences in its properties and values. The value of the kind property is PersistentVolumeClaim. The resources.requests.storage field is also added. Here its value matches the provisioned PV capacity (3Gi in our case); a claim can request at most the capacity of the PV it binds to.

The configuration will look like the manifest file below:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi

Step 1: Create a YAML file.

$ vim pvc-config.yaml

Step 2: Copy and paste the configuration above into the YAML file you just created.

Step 3: Create the PVC by running the kubectl create command.

$ kubectl create -f pvc-config.yaml
persistentvolumeclaim/my-claim created

Step 4: Check the status of the PVC by running the kubectl get command.

$ kubectl get pvc my-claim

NAME       STATUS   VOLUME      CAPACITY   ACCESS MODES   STORAGECLASS   AGE
my-claim   Bound    my-volume   3Gi        RWO                           7m

Step 5: Check the description of the PVC.

$ kubectl describe pvc my-claim

Name:          my-claim
Namespace:     default
StorageClass: 
Status:        Bound
Volume:        my-volume
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      3Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Mounted By:    <none>
Events:        <none>

As shown above, the status indicates a “Bound” state, which means the PVC and PV are now bound together. Now, check the status of the PV once again with the kubectl get command.

$ kubectl get pv my-volume

NAME        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM              STORAGECLASS   REASON   AGE
my-volume   3Gi        RWO            Retain           Bound    default/my-claim                           11m

The output shows that the status has changed from “Available” to “Bound” because the PV has been claimed by the created PVC.

Step 6: Delete the PVC using the kubectl delete command.

$ kubectl delete pvc my-claim

persistentvolumeclaim "my-claim" deleted

Step 7: Check the status of both the PVC and the PV with the kubectl get command.

$ kubectl get pvc my-claim

No resources found in default namespace.

$ kubectl get pv my-volume

NAME        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS     CLAIM              STORAGECLASS   REASON   AGE
my-volume   3Gi        RWO            Retain           Released   default/my-claim                           21m

The output shows that the PV status has changed from “Bound” to “Released” after the PVC was deleted.
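
The “Released” state follows from the Retain reclaim policy shown in the output, which is the default for a manually created PV. As an illustrative sketch, the same PV can declare that policy explicitly through the persistentVolumeReclaimPolicy field. With Retain, the data stays in place after the claim is deleted, and the PV is not offered to new claims until an administrator reclaims it.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-volume
spec:
  capacity:
    storage: 3Gi
  accessModes:
    - ReadWriteOnce
  # Keep the volume and its data when the bound claim is deleted.
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: "/app/data"
```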

PersistentVolumes States

A PV can be in any one of the following states, each with its own meaning:

PersistentVolumeClaims States

Each PVC, like the PV, has its own states that represent its current status.

How to Use PersistentVolumeClaim in a Pod

A Pod accesses storage through a PVC, which it uses as a volume. To use a PVC in a Pod, declare a “volumes” property in the Pod manifest and specify the claim name under the “persistentVolumeClaim” volume type. Both the PVC and the Pod using it must exist in the same namespace, so that the cluster can find the claim in the Pod’s namespace and use it to access the PersistentVolume bound to the PVC. Once the Pod is created, the applications in its containers can read from and write to the storage. The complete configuration file will look like this:

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
  - name: stclass-test
    image: nginx
    volumeMounts:
    - mountPath: "/app/data"
      name: my-volume
  volumes:
  - name: my-volume
    persistentVolumeClaim:
      claimName: my-claim 

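The steps below will guide you through using a claim in a Pod. The PV and PVC have to be provisioned before creating the Pod. If you deleted the claim in the previous section, recreate both the PV and the PVC first, because a PV in the “Released” state will not bind to a new claim. The claim name inside the Pod must also match the name of the existing PVC.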

Step 1: Create a YAML file.

$ vim pvc-pod.yaml

Step 2: Copy and paste the Pod manifest above into the YAML file you just created, then create the Pod with the kubectl create command.

$ kubectl create -f pvc-pod.yaml

pod/my-pod created

Step 3: Check the status and the description of the Pod.

$ kubectl get pod my-pod

NAME     READY   STATUS    RESTARTS   AGE
my-pod   1/1     Running   0          6m50s

$ kubectl describe pod my-pod

Name:         my-pod
Namespace:    default
Priority:     0
Node:         node01/172.17.0.57
Start Time:   Tue, 01 Dec 2020 11:44:08 +0000
Labels:       <none>
Annotations:  <none>
Status:       Running
IP:           10.244.1.2
IPs:
  IP:  10.244.1.2
Volumes:
  my-volume:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  my-claim
    ReadOnly:   false
  default-token-nlmxj:
    Type:       Secret (a volume populated by a Secret)
    SecretName: default-token-nlmxj
    Optional:   false
QoS Class:      BestEffort

Step 4: Exec into the Pod to test the PVC use case.

$ kubectl exec -it my-pod -- /bin/bash

root@my-pod:/# 

Step 5: Use df -h together with the path to confirm the mount point.

root@my-pod:/# df -h /app/data

Filesystem                   Size  Used Avail Use% Mounted on
/dev/mapper/host01--vg-root  191G   22G  159G  13% /app/data

Step 6: Change into the mount directory and create a file using the echo command.

root@my-pod:/# cd /app/data
root@my-pod:/app/data# echo "I love Kubermatic" > file.txt

Step 7: Check the created file and its contents using the ls and cat commands, then exit the Pod once this has been confirmed.

root@my-pod:/app/data# ls
file.txt
root@my-pod:/app/data# cat file.txt
I love Kubermatic
root@my-pod:/app/data# exit

Step 8: Delete and recreate the Pod.

$ kubectl delete pod my-pod
pod "my-pod" deleted
$ kubectl get pods

No resources found in default namespace.

Recreate the Pod and check the status:

$ kubectl create -f pvc-pod.yaml
pod/my-pod created

$ kubectl get pod my-pod

NAME     READY     STATUS    RESTARTS   AGE
my-pod    1/1      Running     0         5s

Step 9: Exec into the Pod and check whether the file and data created earlier exist in the new Pod.

$ kubectl exec -it my-pod -- /bin/bash
root@my-pod:/# df -h /app/data

Filesystem                   Size  Used Avail Use% Mounted on
/dev/mapper/host01--vg-root  191G   22G  159G  13% /app/data

root@my-pod:/# cd /app/data
root@my-pod:/app/data# ls
file.txt
root@my-pod:/app/data# cat file.txt
I love Kubermatic

You can see that the file and data created in the deleted Pod are still there and are available to the new Pod.

Summary: PersistentVolumes and PersistentVolumeClaims are Kubernetes objects that work in tandem to give your Kubernetes applications a higher level of persistence, either statically or dynamically, with the help of StorageClass.

In the next part of our series, we will look at another way to persist data in Kubernetes with StorageClass. You’ll see its functionalities in action, why it’s needed, and how it can be used in a Pod together with a claim.

We’d love to hear from you! Please contact us with any thoughts or questions you might have about PersistentVolumes and PersistentVolumeClaims.

Seyi Ewegbemi

Student Worker