kubernetes, persistent-storage

Is Kubernetes local/csi PV content synced into a new node?


According to the documentation:

A PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned ... It is a resource in the cluster just like a node is a cluster resource...

So I was reading about all the currently available PV plugins, and I understand that for 3rd-party / out-of-cluster storage this doesn't matter (e.g. storing data in EBS, Azure or GCE disks), because there are few or no implications when adding or removing nodes from a cluster. However, there are other types (ignoring hostPath, as that works only for single-node clusters):

  • csi
  • local

which (at least from what I've read in the docs) don't require 3rd-party vendors/software.

But also:

... local volumes are subject to the availability of the underlying node and are not suitable for all applications. If a node becomes unhealthy, then the local volume becomes inaccessible by the pod. The pod using this volume is unable to run. Applications using local volumes must be able to tolerate this reduced availability, as well as potential data loss, depending on the durability characteristics of the underlying disk.

The local PersistentVolume requires manual cleanup and deletion by the user if the external static provisioner is not used to manage the volume lifecycle.

Use-case

Let's say I have a single-node cluster with a single local PV, and I want to add a new node to the cluster, giving me a 2-node cluster (small numbers for simplicity).

Will the data from the already existing local PV be replicated 1:1 onto the new node, as in one PV with 2 nodes of redundancy, or is it strictly bound to the existing node only?

If the already existing PV can't be adjusted from 1 to 2 nodes, can a new PV be created from scratch so that it's replicated 1:1 between 2+ nodes on the cluster?

Alternatively, if not, what would be the correct approach without using a 3rd-party, out-of-cluster solution? Would using csi change the overall approach, or is it the same with regard to redundancy, just with a different "engine" under the hood?


Solution

  • Can a new PV be created so it's 1:1 replicated between 2+ nodes on the cluster?

    None of the standard volume types are replicated at all. If you can use a volume type that supports ReadWriteMany access (most readily NFS) then multiple pods can use it simultaneously, but you would have to run the matching NFS server.
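
    As a concrete illustration, here is a minimal sketch of a statically provisioned NFS-backed PV and a matching claim with ReadWriteMany access; the server address and export path are placeholders for an NFS server you would have to run yourself:

    ```yaml
    # PersistentVolume backed by an NFS export; ReadWriteMany lets pods
    # on different nodes mount the same volume simultaneously.
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: shared-nfs-pv
    spec:
      capacity:
        storage: 10Gi
      accessModes:
        - ReadWriteMany
      persistentVolumeReclaimPolicy: Retain
      nfs:
        server: nfs-server.example.com  # placeholder: your own NFS server
        path: /exports/shared           # placeholder: exported directory
    ---
    # Matching claim; storageClassName "" binds to the static PV above.
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: shared-nfs-pvc
    spec:
      accessModes:
        - ReadWriteMany
      storageClassName: ""
      resources:
        requests:
          storage: 10Gi
    ```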

    Of the volume types you reference:

    • hostPath is a directory on the node the pod happens to be running on. It's not a directory on any specific node, so if the pod gets recreated on a different node, it will refer to the same directory but on the new node, presumably with different content. Aside from basic test scenarios I'm not sure when a hostPath PersistentVolume would be useful.

    • local is a directory on a specific node, or at least one that follows a node-affinity constraint. Kubernetes knows that not all storage can be mounted on every node, so this automatically constrains the pod to run on the node that has the directory (assuming the node still exists); the sketch after this list shows what that constraint looks like.

    • csi is an extremely generic extension mechanism, so that you can run storage drivers that aren't on the list you link to. There are some features that might be better supported by the CSI version of a storage backend than the in-tree version. (I'm familiar with AWS: the EBS CSI driver supports snapshots and resizing; the EFS CSI driver can dynamically provision NFS directories.)

    In the specific case of a local test cluster (say, using kind), using a local volume will constrain pods to run on the node that has the data, which is more robust than using a hostPath volume. It won't replicate the data, though, so if the node with the data is deleted, the data goes away with it.
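
    For reference, this is roughly what a statically provisioned local PV looks like; the node name worker-1 and the path are placeholders. The required nodeAffinity block is what pins the volume (and therefore any pod using it) to one node, and WaitForFirstConsumer binding lets the scheduler take that constraint into account:

    ```yaml
    # StorageClass for statically provisioned local volumes; there is
    # no dynamic provisioner for the local volume type.
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: local-storage
    provisioner: kubernetes.io/no-provisioner
    volumeBindingMode: WaitForFirstConsumer
    ---
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: local-pv-worker-1
    spec:
      capacity:
        storage: 5Gi
      accessModes:
        - ReadWriteOnce
      persistentVolumeReclaimPolicy: Retain
      storageClassName: local-storage
      local:
        path: /mnt/disks/vol1           # placeholder: directory on that node
      nodeAffinity:                     # required for local volumes
        required:
          nodeSelectorTerms:
            - matchExpressions:
                - key: kubernetes.io/hostname
                  operator: In
                  values:
                    - worker-1          # placeholder node name
    ```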