Docker & Kubernetes : Persistent Volumes & Persistent Volumes Claims - hostPath and annotations
A Container's file system lives only as long as the Container does. So when a Container terminates and restarts, filesystem changes are lost. A Kubernetes volume, unlike the volume in Docker, has an explicit lifetime - the same as the Pod that encloses it.
Consequently, a volume outlives any Containers that run within the Pod, and data is preserved across Container restarts. Of course, when a Pod ceases to exist, the volume will cease to exist, too.
However, Kubernetes supports many types of volumes, and a Pod can use any number of them simultaneously. For more consistent storage that is independent of the Container, we can use a Volume. This is especially important for stateful applications, such as key-value stores (such as Redis) and databases.
To use a volume, a Pod specifies what volumes to provide for the Pod (the .spec.volumes
field) and where to mount those into Containers (the .spec.containers.volumeMounts
field) as shown below (busybox.yaml):
kind: Pod apiVersion: v1 metadata: name: busybox-pod spec: volumes: - name: cache containers: - name: bysybox-container image: busybox command: ['sleep', '3600'] volumeMounts: - mountPath: "/cache" name: cache
We can check the mounted volume, cache:
$ kubectl create -f busybox.yaml pod/busybox-pod created $ kubectl get pods NAME READY STATUS RESTARTS AGE busybox-pod 1/1 Running 0 18s $ kubectl exec -it busybox-pod sh / # ls -la total 48 drwxr-xr-x 1 root root 4096 Mar 25 17:35 . drwxr-xr-x 1 root root 4096 Mar 25 17:35 .. -rwxr-xr-x 1 root root 0 Mar 25 17:35 .dockerenv drwxr-xr-x 2 root root 12288 Feb 14 18:58 bin drwxrwxrwx 2 root root 4096 Mar 25 17:35 cache ...
Here, we created a Pod that runs one Container. This Pod has a Volume of type emptyDir that lasts for the life of the Pod, even if the Container terminates and restarts.
A process in a container sees a filesystem view composed from their Docker image and volumes. The Docker image is at the root of the filesystem hierarchy, and any volumes are mounted at the specified paths within the image.
Each Container in the Pod must independently specify where to mount each volume.
In this section, we'll create a Pod that runs one Container. This Pod has a Volume of type emptyDir that lasts for the life of the Pod, even if the Container terminates and restarts. Here is the configuration file for the Pod (redis.yaml):
apiVersion: v1 kind: Pod metadata: name: redis spec: containers: - name: redis image: redis volumeMounts: - name: redis-storage mountPath: /data/redis volumes: - name: redis-storage emptyDir: {}
We can check the mounted volume, cache:
$ kubectl create -f redis.yaml pod/redis created $ kubectl get pod redis --watch NAME READY STATUS RESTARTS AGE redis 1/1 Running 0 33s
In another terminal, get a shell to the running Container and create a file:
$ kubectl exec -it redis -- /bin/bash # cd /data/redis/ root@redis:/data/redis# echo Hello > test-file root@redis:/data/redis# ls test-file
In the shell, list the running processes:
root@redis:/data/redis# apt-get update root@redis:/data/redis# apt-get install procps root@redis:/data/redis# ps aux USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND redis 1 0.1 0.1 50288 3068 ? Ssl 19:27 0:00 redis-server *: root 15 0.0 0.1 18136 2100 pts/0 Ss 19:29 0:00 /bin/bash root 275 0.0 0.1 36636 2756 pts/0 R+ 19:32 0:00 ps aux
In our shell, kill the Redis process:
root@redis:/data/redis# kill 1 root@redis:/data/redis# command terminated with exit code 137
In our original terminal, watch for changes to the Redis Pod. Eventually, we will see something like this:
$ kubectl get pod redis --watch NAME READY STATUS RESTARTS AGE redis 1/1 Running 0 33s redis 0/1 Completed 0 8m50s redis 1/1 Running 1 8m53s
At this point, the Container has terminated and restarted. This is because the Redis Pod has a "restartPolicy" of "Always".
Let's get into a shell of the restarted Container:
kubectl exec -it redis -- /bin/bash root@redis:/data# ls /data/redis test-file
We can see that test-file we created before the container restarted is still there.
Note that in addition to the local disk storage provided by "emptyDir", Kubernetes supports many different network-attached storage solutions, including PD on GCE and EBS on EC2, which are preferred for critical data and will handle details such as mounting and unmounting the devices on the nodes.
In the following sections, we'll learn how to configure a Pod to use a PersistentVolumeClaim for storage. Here is a summary of the process:
- A cluster administrator creates a PersistentVolume that is backed by physical storage. The administrator does not associate the volume with any Pod.
- A cluster user creates a PersistentVolumeClaim, which gets automatically bound to a suitable PersistentVolume.
- The user creates a Pod that uses the PersistentVolumeClaim as storage.
In this section, we'll create a hostPath PersistentVolume. Kubernetes supports hostPath for development and testing on a single-node cluster. A hostPath PersistentVolume uses a file or directory on the Node to emulate network-attached storage.
In a production cluster, however, we would not use hostPath. Instead a cluster administrator would provision a network resource like a Google Compute Engine persistent disk, an NFS share, or an Amazon Elastic Block Store volume. Cluster administrators can also use StorageClasses to set up dynamic provisioning.
Before we move on, we want to create an index.html file on our Node (minikube).
Let's open a shell on minikube and create the file:
$ minikube ssh _ _ _ _ ( ) ( ) ___ ___ (_) ___ (_)| |/') _ _ | |_ __ /' _ ` _ `\| |/' _ `\| || , < ( ) ( )| '_`\ /'__`\ | ( ) ( ) || || ( ) || || |\`\ | (_) || |_) )( ___/ (_) (_) (_)(_)(_) (_)(_)(_) (_)`\___/'(_,__/'`\____) $ sudo mkdir /mnt/data $ echo 'Hello from Kubernetes storage' | sudo tee -a /mnt/data/index.html Hello from Kubernetes storage
Here is the configuration file for the hostPath PersistentVolume (pv-volume.yaml):
kind: PersistentVolume apiVersion: v1 metadata: name: task-pv-volume labels: type: local spec: storageClassName: manual capacity: storage: 10Gi accessModes: - ReadWriteOnce hostPath: path: "/mnt/data"
The configuration file specifies that the volume is at /mnt/data on the cluster's Node. The configuration also specifies a size of 10 gibibytes and an access mode of ReadWriteOnce, which means the volume can be mounted as read-write by a single Node. It defines the StorageClass name manual for the PersistentVolume, which will be used to bind PersistentVolumeClaim requests to this PersistentVolume.
Let's create the PersistentVolume and view it:
$ kubectl create -f pv-volume.yaml persistentvolume/task-pv-volume created $ kubectl get pv task-pv-volume NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE task-pv-volume 10Gi RWO Retain Available manual 24s
The output indicates that the PersistentVolume has a STATUS of Available. This means it has not yet been bound to a PersistentVolumeClaim.
The next step is to create a PersistentVolumeClaim. Pods use PersistentVolumeClaims to request physical storage. In this section, we create a PersistentVolumeClaim that requests a volume of at least three gibibytes that can provide read-write access for at least one Node.
Here is the configuration file for the PersistentVolumeClaim (pv-claim.yaml):
kind: PersistentVolumeClaim apiVersion: v1 metadata: name: task-pv-claim spec: storageClassName: manual accessModes: - ReadWriteOnce resources: requests: storage: 3Gi
Let's create the PersistentVolumeClaim:
$ kubectl create -f pv-claim.yaml persistentvolumeclaim/task-pv-claim created
After we created the PersistentVolumeClaim, the Kubernetes control plane looks for a PersistentVolume that satisfies the claim's requirements. If the control plane finds a suitable PersistentVolume with the same StorageClass, it binds the claim to the volume.
Look again at the PersistentVolume:
$ kubectl get pv task-pv-volume NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE task-pv-volume 10Gi RWO Retain Bound default/task-pv-claim manual 9m
Now the output shows a STATUS of Bound!
Then, let's look at the PersistentVolumeClaim:
$ kubectl get pvc task-pv-claim NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE task-pv-claim Bound task-pv-volume 10Gi RWO manual 4m
Now it's time to create a Pod that uses our PersistentVolumeClaim as a volume. The configuration file for the Pod looks like this (pv-pod.yaml):
kind: Pod apiVersion: v1 metadata: name: task-pv-pod spec: volumes: - name: task-pv-storage persistentVolumeClaim: claimName: task-pv-claim containers: - name: task-pv-container image: nginx ports: - containerPort: 80 name: "http-server" volumeMounts: - mountPath: "/usr/share/nginx/html" name: task-pv-storage
One thing to notice is that the Pod's configuration file specifies a PersistentVolumeClaim, but it does not specify a PersistentVolume. That's because from the Pod's point of view, the claim is a volume.
Let's create the Pod:
$ kubectl create -f pv-pod.yaml pod/task-pv-pod created $ kubectl get pod task-pv-pod NAME READY STATUS RESTARTS AGE task-pv-pod 1/1 Running 0 59s
Get a shell to the Container running in our Pod and check if nginx is serving the index.html file from the hostPath volume:
$ kubectl exec -it task-pv-pod -- /bin/bash root@task-pv-pod:/# apt-get update root@task-pv-pod:/# apt-get install curl root@task-pv-pod:/# curl localhost Hello from Kubernetes storage
The output shows the text that we wrote to the index.html file on the hostPath volume!
Storage configured with a group ID (GID) allows writing only by Pods using the same GID. Mismatched or missing GIDs cause permission denied errors. To reduce the need for coordination with users, an administrator can annotate a PersistentVolume with a GID. Then the GID is automatically added to any Pod that uses the PersistentVolume.
kind: PersistentVolume apiVersion: v1 metadata: name: task-pv-volume labels: type: local annotations: pv.beta.kubernetes.io/gid: "1001" spec: storageClassName: manual capacity: storage: 10Gi accessModes: - ReadWriteOnce hostPath: path: "/mnt/data"
So, when a Pod consumes a PersistentVolume that has a GID annotation, the annotated GID is applied to all Containers in the Pod in the same way that GIDs specified in the Pod's security context are. Every GID, whether it originates from a PersistentVolume annotation or the Pod’s specification, is applied to the first process run in each Container.
Docker & K8s
- Docker install on Amazon Linux AMI
- Docker install on EC2 Ubuntu 14.04
- Docker container vs Virtual Machine
- Docker install on Ubuntu 14.04
- Docker Hello World Application
- Nginx image - share/copy files, Dockerfile
- Working with Docker images : brief introduction
- Docker image and container via docker commands (search, pull, run, ps, restart, attach, and rm)
- More on docker run command (docker run -it, docker run --rm, etc.)
- Docker Networks - Bridge Driver Network
- Docker Persistent Storage
- File sharing between host and container (docker run -d -p -v)
- Linking containers and volume for datastore
- Dockerfile - Build Docker images automatically I - FROM, MAINTAINER, and build context
- Dockerfile - Build Docker images automatically II - revisiting FROM, MAINTAINER, build context, and caching
- Dockerfile - Build Docker images automatically III - RUN
- Dockerfile - Build Docker images automatically IV - CMD
- Dockerfile - Build Docker images automatically V - WORKDIR, ENV, ADD, and ENTRYPOINT
- Docker - Apache Tomcat
- Docker - NodeJS
- Docker - NodeJS with hostname
- Docker Compose - NodeJS with MongoDB
- Docker - Prometheus and Grafana with Docker-compose
- Docker - StatsD/Graphite/Grafana
- Docker - Deploying a Java EE JBoss/WildFly Application on AWS Elastic Beanstalk Using Docker Containers
- Docker : NodeJS with GCP Kubernetes Engine
- Docker : Jenkins Multibranch Pipeline with Jenkinsfile and Github
- Docker : Jenkins Master and Slave
- Docker - ELK : ElasticSearch, Logstash, and Kibana
- Docker - ELK 7.6 : Elasticsearch on Centos 7
- Docker - ELK 7.6 : Filebeat on Centos 7
- Docker - ELK 7.6 : Logstash on Centos 7
- Docker - ELK 7.6 : Kibana on Centos 7
- Docker - ELK 7.6 : Elastic Stack with Docker Compose
- Docker - Deploy Elastic Cloud on Kubernetes (ECK) via Elasticsearch operator on minikube
- Docker - Deploy Elastic Stack via Helm on minikube
- Docker Compose - A gentle introduction with WordPress
- Docker Compose - MySQL
- MEAN Stack app on Docker containers : micro services
- MEAN Stack app on Docker containers : micro services via docker-compose
- Docker Compose - Hashicorp's Vault and Consul Part A (install vault, unsealing, static secrets, and policies)
- Docker Compose - Hashicorp's Vault and Consul Part B (EaaS, dynamic secrets, leases, and revocation)
- Docker Compose - Hashicorp's Vault and Consul Part C (Consul)
- Docker Compose with two containers - Flask REST API service container and an Apache server container
- Docker compose : Nginx reverse proxy with multiple containers
- Docker & Kubernetes : Envoy - Getting started
- Docker & Kubernetes : Envoy - Front Proxy
- Docker & Kubernetes : Ambassador - Envoy API Gateway on Kubernetes
- Docker Packer
- Docker Cheat Sheet
- Docker Q & A #1
- Kubernetes Q & A - Part I
- Kubernetes Q & A - Part II
- Docker - Run a React app in a docker
- Docker - Run a React app in a docker II (snapshot app with nginx)
- Docker - NodeJS and MySQL app with React in a docker
- Docker - Step by Step NodeJS and MySQL app with React - I
- Installing LAMP via puppet on Docker
- Docker install via Puppet
- Nginx Docker install via Ansible
- Apache Hadoop CDH 5.8 Install with QuickStarts Docker
- Docker - Deploying Flask app to ECS
- Docker Compose - Deploying WordPress to AWS
- Docker - WordPress Deploy to ECS with Docker-Compose (ECS-CLI EC2 type)
- Docker - WordPress Deploy to ECS with Docker-Compose (ECS-CLI Fargate type)
- Docker - ECS Fargate
- Docker - AWS ECS service discovery with Flask and Redis
- Docker & Kubernetes : minikube
- Docker & Kubernetes 2 : minikube Django with Postgres - persistent volume
- Docker & Kubernetes 3 : minikube Django with Redis and Celery
- Docker & Kubernetes 4 : Django with RDS via AWS Kops
- Docker & Kubernetes : Kops on AWS
- Docker & Kubernetes : Ingress controller on AWS with Kops
- Docker & Kubernetes : HashiCorp's Vault and Consul on minikube
- Docker & Kubernetes : HashiCorp's Vault and Consul - Auto-unseal using Transit Secrets Engine
- Docker & Kubernetes : Persistent Volumes & Persistent Volumes Claims - hostPath and annotations
- Docker & Kubernetes : Persistent Volumes - Dynamic volume provisioning
- Docker & Kubernetes : DaemonSet
- Docker & Kubernetes : Secrets
- Docker & Kubernetes : kubectl command
- Docker & Kubernetes : Assign a Kubernetes Pod to a particular node in a Kubernetes cluster
- Docker & Kubernetes : Configure a Pod to Use a ConfigMap
- AWS : EKS (Elastic Container Service for Kubernetes)
- Docker & Kubernetes : Run a React app in a minikube
- Docker & Kubernetes : Minikube install on AWS EC2
- Docker & Kubernetes : Cassandra with a StatefulSet
- Docker & Kubernetes : Terraform and AWS EKS
- Docker & Kubernetes : Pods and Service definitions
- Docker & Kubernetes : Service IP and the Service Type
- Docker & Kubernetes : Kubernetes DNS with Pods and Services
- Docker & Kubernetes : Headless service and discovering pods
- Docker & Kubernetes : Scaling and Updating application
- Docker & Kubernetes : Horizontal pod autoscaler on minikubes
- Docker & Kubernetes : From a monolithic app to micro services on GCP Kubernetes
- Docker & Kubernetes : Rolling updates
- Docker & Kubernetes : Deployments to GKE (Rolling update, Canary and Blue-green deployments)
- Docker & Kubernetes : Slack Chat Bot with NodeJS on GCP Kubernetes
- Docker & Kubernetes : Continuous Delivery with Jenkins Multibranch Pipeline for Dev, Canary, and Production Environments on GCP Kubernetes
- Docker & Kubernetes : NodePort vs LoadBalancer vs Ingress
- Docker & Kubernetes : MongoDB / MongoExpress on Minikube
- Docker & Kubernetes : Load Testing with Locust on GCP Kubernetes
- Docker & Kubernetes : MongoDB with StatefulSets on GCP Kubernetes Engine
- Docker & Kubernetes : Nginx Ingress Controller on Minikube
- Docker & Kubernetes : Setting up Ingress with NGINX Controller on Minikube (Mac)
- Docker & Kubernetes : Nginx Ingress Controller for Dashboard service on Minikube
- Docker & Kubernetes : Nginx Ingress Controller on GCP Kubernetes
- Docker & Kubernetes : Kubernetes Ingress with AWS ALB Ingress Controller in EKS
- Docker & Kubernetes : Setting up a private cluster on GCP Kubernetes
- Docker & Kubernetes : Kubernetes Namespaces (default, kube-public, kube-system) and switching namespaces (kubens)
- Docker & Kubernetes : StatefulSets on minikube
- Docker & Kubernetes : RBAC
- Docker & Kubernetes Service Account, RBAC, and IAM
- Docker & Kubernetes - Kubernetes Service Account, RBAC, IAM with EKS ALB, Part 1
- Docker & Kubernetes : Helm Chart
- Docker & Kubernetes : My first Helm deploy
- Docker & Kubernetes : Readiness and Liveness Probes
- Docker & Kubernetes : Helm chart repository with Github pages
- Docker & Kubernetes : Deploying WordPress and MariaDB with Ingress to Minikube using Helm Chart
- Docker & Kubernetes : Deploying WordPress and MariaDB to AWS using Helm 2 Chart
- Docker & Kubernetes : Deploying WordPress and MariaDB to AWS using Helm 3 Chart
- Docker & Kubernetes : Helm Chart for Node/Express and MySQL with Ingress
- Docker & Kubernetes : Deploy Prometheus and Grafana using Helm and Prometheus Operator - Monitoring Kubernetes node resources out of the box
- Docker & Kubernetes : Deploy Prometheus and Grafana using kube-prometheus-stack Helm Chart
- Docker & Kubernetes : Istio (service mesh) sidecar proxy on GCP Kubernetes
- Docker & Kubernetes : Istio on EKS
- Docker & Kubernetes : Istio on Minikube with AWS EC2 for Bookinfo Application
- Docker & Kubernetes : Deploying .NET Core app to Kubernetes Engine and configuring its traffic managed by Istio (Part I)
- Docker & Kubernetes : Deploying .NET Core app to Kubernetes Engine and configuring its traffic managed by Istio (Part II - Prometheus, Grafana, pin a service, split traffic, and inject faults)
- Docker & Kubernetes : Helm Package Manager with MySQL on GCP Kubernetes Engine
- Docker & Kubernetes : Deploying Memcached on Kubernetes Engine
- Docker & Kubernetes : EKS Control Plane (API server) Metrics with Prometheus
- Docker & Kubernetes : Spinnaker on EKS with Halyard
- Docker & Kubernetes : Continuous Delivery Pipelines with Spinnaker and Kubernetes Engine
- Docker & Kubernetes : Multi-node Local Kubernetes cluster : Kubeadm-dind (docker-in-docker)
- Docker & Kubernetes : Multi-node Local Kubernetes cluster : Kubeadm-kind (k8s-in-docker)
- Docker & Kubernetes : nodeSelector, nodeAffinity, taints/tolerations, pod affinity and anti-affinity - Assigning Pods to Nodes
- Docker & Kubernetes : Jenkins-X on EKS
- Docker & Kubernetes : ArgoCD App of Apps with Heml on Kubernetes
- Docker & Kubernetes : ArgoCD on Kubernetes cluster
- Docker & Kubernetes : GitOps with ArgoCD for Continuous Delivery to Kubernetes clusters (minikube) - guestbook
Ph.D. / Golden Gate Ave, San Francisco / Seoul National Univ / Carnegie Mellon / UC Berkeley / DevOps / Deep Learning / Visualization