Tags: kubernetes, persistent-volumes, persistent-volume-claims

In which real world scenario would you use ReadWriteOnce over ReadWriteMany for a PVC in Kubernetes?


Just as a quick reminder, that access mode limits how many nodes can read/write to a volume, not how many pods can access it. You can have a RWO volume accessed by multiple pods as long as they are running on the same worker node.

Having said that, when and why would you use ReadWriteOnce over ReadWriteMany?

I legitimately don't know and would like to understand this; RWO seems too limiting to me, as the pods would all have to run on a single node.

I mean, even if your deployment contains a single replica (one pod), why would you not let that pod be created wherever the scheduler pleases?

This is confusing, please help.


Solution

  • I would pretty much always pick a ReadWriteOnce volume.

    Mechanically, if you look at the list of volume types, the ones that are easier to set up tend to be ReadWriteOnce. If your infrastructure is running on AWS, for example, an awsElasticBlockStore volume is ReadWriteOnce; you need to set up something like an NFS server to get ReadWriteMany (arguably EFS makes this easier). A minimal ReadWriteOnce PVC is sketched after this answer.

    As far as your application goes, managing a shared filesystem is tricky, especially in a clustered environment. You need to be careful not to have multiple tasks writing to the same file, and file locking may not work reliably. If applications are generating new files then they need to make sure to pick distinct names, and you can't reliably check whether a name exists before creating a file.

    So architecturally a more typical approach is to have some sort of storage management process. This could be something that presents an HTTP interface on top of a filesystem; it could be something more involved like a database; or it could be something cloud-managed (again in AWS, S3 essentially fits this need). That process handles these concurrency considerations, but since there is only one of it, it only needs ReadWriteOnce storage.

    An extension of this is some sort of storage system that knows it's running in a clustered environment. At small scale, the etcd and ZooKeeper configuration systems know about this; at larger scale, dedicated clustered databases like Elasticsearch implement this themselves. These can run multiple copies of themselves, but each manages a different subset of the data, and they know how to replicate the data amongst the different copies. Again, the disk storage isn't shared in this architecture; in Kubernetes you'd deploy these with a StatefulSet that creates a distinct ReadWriteOnce PersistentVolumeClaim for each pod (see the StatefulSet sketch at the end of this answer).

    As @Jonas notes in their answer, typically your application pods should not have any volumes attached at all. All of their data should be in a database or some other storage system. This gives you a centralized point to manage the data, and it makes it much easier to scale the application up and down if you don't need to worry about what happens to data when you delete half the pods.
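For concreteness, here is a minimal sketch of the kind of ReadWriteOnce claim referenced above, assuming an AWS/EKS-style cluster with an EBS-backed StorageClass named gp2. The claim name app-data, the class name, and the size are hypothetical placeholders; check `kubectl get storageclass` for what your cluster actually offers.

```yaml
# Minimal sketch: a ReadWriteOnce claim against an EBS-backed class.
# "gp2" and "app-data" are assumptions for illustration, not values from the answer.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes:
    - ReadWriteOnce      # mountable read-write by a single node at a time
  storageClassName: gp2  # EBS-backed classes only support ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```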
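And a sketch of the per-pod storage pattern from the last paragraphs: a StatefulSet whose volumeClaimTemplates give each replica its own ReadWriteOnce volume. The image example/clustered-store:1.0, the mount path, and the headless Service name store are hypothetical placeholders, not anything prescribed by the answer.

```yaml
# Sketch: each replica gets its own RWO PVC (data-store-0, data-store-1, ...).
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: store
spec:
  serviceName: store                  # assumes a matching headless Service exists
  replicas: 3
  selector:
    matchLabels:
      app: store
  template:
    metadata:
      labels:
        app: store
    spec:
      containers:
        - name: store
          image: example/clustered-store:1.0   # hypothetical clustered application
          volumeMounts:
            - name: data
              mountPath: /var/lib/store        # hypothetical data directory
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]         # one non-shared volume per pod
        resources:
          requests:
            storage: 20Gi
```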