I'm pretty stuck on this step of learning Kubernetes: PV and PVC.
What I'm trying to do here is understand how to handle a shared read-write volume on multiple pods.
What I understood is that a PVC cannot be shared between pods unless an NFS-like storage class has been configured.
I'm still using my hostPath storage class, and I tried the following (on Docker Desktop and on a 3-node MicroK8s cluster):
PVC with dynamic hostPath provisioning:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-desktop
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Mi
Deployments mounting this PVC:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: busybox
spec:
  replicas: 3
  selector:
    matchLabels:
      app: busybox
  template:
    metadata:
      labels:
        app: busybox
    spec:
      containers:
        - name: busybox
          image: library/busybox:stable
          command: ["/bin/sh"]
          args:
            ["-c", 'while true; do echo "1: $(hostname)" >> /root/index.html; sleep 2; done;']
          volumeMounts:
            - mountPath: /root
              name: vol-desktop
      volumes:
        - name: vol-desktop
          persistentVolumeClaim:
            claimName: pvc-desktop

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:stable
          volumeMounts:
            - mountPath: /usr/share/nginx/html
              name: vol-desktop
          ports:
            - containerPort: 80
      volumes:
        - name: vol-desktop
          persistentVolumeClaim:
            claimName: pvc-desktop
From what I understood of the documentation, this should not be possible, but in reality everything ran smoothly and my Nginx server served the up-to-date index.html file just fine.
It actually worked on both a single-node cluster and a multi-node cluster.
What am I not getting here? Why does this work?
Is every pod mounting its own host path volume on start?
How can hostPath storage work between multiple nodes?
EDIT: For the multi-node case, a network folder had been set up at the same storage path on each machine, which is why everything was replicated successfully. I didn't understand that the same host path is created on each node where that PVC is mounted.
To anyone with the same problem: each node mounting this hostPath PVC will have its own folder created at the PV path.
So without network replication between nodes, only pods on the same node will share the same folder.
This is why it's discouraged on a multi-node cluster: you can't predict which node a pod will land on.
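For a volume that really is shared across nodes, my understanding is that the claim needs the ReadWriteMany access mode and a storage class backed by network storage. A minimal sketch, assuming a hypothetical storage class named nfs-client (that name and the claim name are placeholders, not something from my cluster):

```yaml
# Sketch only: "nfs-client" is a hypothetical storage class that must be
# backed by network storage (NFS or similar) for ReadWriteMany to work.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-shared            # hypothetical name
spec:
  storageClassName: nfs-client
  accessModes:
    - ReadWriteMany           # mountable read-write from pods on many nodes
  resources:
    requests:
      storage: 50Mi
```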
Thanks!
how to handle a shared read-write volume on multiple pods.
Redesign your application to avoid it. It tends to be fragile and difficult to manage multiple writers safely: you depend on your application correctly performing things like file locking, on the underlying shared filesystem implementation handling things properly, and on the system being tolerant of any sort of network hiccup that might happen.
The example you give is something that frequently appears in Docker Compose setups: have an application with a mix of backend code and static files, and then try to publish the static files at runtime through a volume to a reverse proxy. Instead, you can build an image that copies the static files at build time:
ARG app_version=latest
# Name the application image as a stage; build args are reliably expanded in FROM
FROM my/app:${app_version} AS app
FROM nginx
COPY --from=app /app/static /usr/share/nginx/html
Have your CI system build this and push it immediately after the backend image is built. The resulting image serves the corresponding static files, but doesn't require a shared volume or any manual management of the volume contents.
For other types of content, consider storing data in a database, or use an object-storage service that maintains its own backing store and can handle the concurrency considerations. Then most of your pods can be totally stateless, and you can manage the data separately (maybe even outside Kubernetes).
How can hostPath storage work between multiple nodes?
It doesn't. It's an instruction to Kubernetes, on whichever node the pod happens to be scheduled on, to mount that host directory into the container. There's no management of any sort of the directory content; if two pods get scheduled on the same node, they'll share the directory, and if not, they won't; and if your pod's Deployment is updated and the pod is deleted and recreated somewhere else, it might not be the same node and might not have the same data.
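To make that concrete, here is a minimal sketch of a pod using a hostPath volume directly. The pod name, volume name, and the /data/shared path are hypothetical; the point is only that the path refers to a directory on whichever node ends up running the pod.

```yaml
# Sketch only: names and the /data/shared path are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-demo
spec:
  containers:
    - name: writer
      image: busybox:stable
      command: ["/bin/sh", "-c", "echo $(hostname) >> /data/out.txt; sleep 3600"]
      volumeMounts:
        - mountPath: /data
          name: node-dir
  volumes:
    - name: node-dir
      hostPath:
        path: /data/shared        # a directory on the node that runs this pod
        type: DirectoryOrCreate   # created on that node if it does not exist
```

Two copies of this pod on the same node would append to the same out.txt; copies on different nodes would each write to their own node's /data/shared.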
With some very specific exceptions you shouldn't use hostPath volumes at all. The exceptions are things like log collectors run as DaemonSets, where there is exactly one pod on every node and you're interested in picking up the host-directory content that is different on each node.
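A rough sketch of that DaemonSet case, assuming a hypothetical collector image (example/log-collector is a placeholder, not a real image) that reads the node's /var/log:

```yaml
# Sketch only: the image name is a placeholder for whatever collector you run.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-collector
spec:
  selector:
    matchLabels:
      app: log-collector
  template:
    metadata:
      labels:
        app: log-collector
    spec:
      containers:
        - name: collector
          image: example/log-collector:latest   # hypothetical image
          volumeMounts:
            - mountPath: /var/log
              name: host-logs
              readOnly: true                    # the collector only reads node logs
      volumes:
        - name: host-logs
          hostPath:
            path: /var/log                      # differs per node, which is the point
```

Here the per-node nature of hostPath is what you want: each pod sees only the logs of the node it runs on.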
In your specific setup either you're getting lucky with where the data producers and consumers are getting colocated, or there's something about your MicroK8s setup that's causing the host directories to be shared. It is not in general reliable storage.