How does a StatefulSet ensure that a PersistentVolume and Pod will always be provisioned in the same Availability Zone? I understand that each pod in a StatefulSet has a storage identity, and that each pod will remember the PVC it is using, but am struggling to find the official documentation to support this.
Deep inside Kubernetes, the volume driver has the ability to tell the cluster what constraints it has on the pod placement.
If you're running this in AWS/EKS, your actual nodes are probably EC2 instances and the default volume type is an EBS volume, which must be attached to a single node. So when a Pod is created, the cluster needs to mount the EBS volume on the Node's EC2 instance, and for that to happen the EC2 instance and the EBS volume need to be in the same AZ.
One place to see this actually written out is in the Container Storage Interface specification. The CreateVolume RPC exchanges a lot of data, but one part of the request is a TopologyRequirement sub-object, and the Volume it returns lists the topology from which it is accessible. You can see in those objects' comments how the CSI driver can indicate a specific region/AZ in which the volume exists, and so where the target node must be too.
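To make the constraint concrete, here is a minimal Python sketch of the matching idea. It is a simplified model, not the real scheduler or CSI code: the `Volume` and `Node` classes, the `node_can_access` and `feasible_nodes` helpers, and the node and zone names are all hypothetical. The only assumption taken from Kubernetes itself is that zones are exposed as key/value topology segments, here using the well-known `topology.kubernetes.io/zone` label as an example key (a CSI driver may use its own key).

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Example topology key; each CSI driver is free to choose its own.
ZONE_KEY = "topology.kubernetes.io/zone"


@dataclass
class Volume:
    name: str
    # Loosely modelled on the CSI idea of a volume's accessible topology:
    # a list of segment maps, any one of which a node may satisfy.
    accessible_topology: List[Dict[str, str]] = field(default_factory=list)


@dataclass
class Node:
    name: str
    segments: Dict[str, str]


def node_can_access(node: Node, volume: Volume) -> bool:
    """True if the node matches at least one of the volume's topologies."""
    if not volume.accessible_topology:
        return True  # no topology constraint reported for this volume
    return any(
        all(node.segments.get(key) == value for key, value in topo.items())
        for topo in volume.accessible_topology
    )


def feasible_nodes(nodes: List[Node], volume: Volume) -> List[str]:
    """Names of the nodes on which a Pod using this volume could run."""
    return [n.name for n in nodes if node_can_access(n, volume)]


# Hypothetical cluster: two nodes in different zones, one zonal volume.
nodes = [
    Node("node-a", {ZONE_KEY: "us-east-1a"}),
    Node("node-b", {ZONE_KEY: "us-east-1b"}),
]
volume = Volume("vol-1", [{ZONE_KEY: "us-east-1a"}])

print(feasible_nodes(nodes, volume))  # ['node-a']
```

Only `node-a` survives the filter, which is the same outcome as "the EC2 instance and the EBS volume need to be in the same AZ", expressed as data.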
StatefulSets don't do anything special in this space; they just create PersistentVolumeClaims and Pods with those PVCs mounted as volumes. They let the cluster allocate the underlying PersistentVolumes and resolve the corresponding placement constraints.
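As for how each Pod "remembers" its PVC, that comes from naming rather than from any zone logic. The sketch below illustrates the documented convention that a StatefulSet Pod is named `<statefulset>-<ordinal>` and its claim is named `<volumeClaimTemplate>-<pod name>`. The helper names, the `bound_zone` table, and the zone values are hypothetical, and the lookup is only a stand-in for the PVC-to-PV binding that the cluster maintains.

```python
def pod_name(statefulset: str, ordinal: int) -> str:
    """Stable Pod name for a StatefulSet replica."""
    return f"{statefulset}-{ordinal}"


def pvc_name(claim_template: str, statefulset: str, ordinal: int) -> str:
    """Name of the PVC created from a volumeClaimTemplate for that replica."""
    return f"{claim_template}-{pod_name(statefulset, ordinal)}"


# Hypothetical record of where each claim's volume ended up being provisioned.
bound_zone = {
    "www-web-0": "us-east-1a",
    "www-web-1": "us-east-1b",
}

# A recreated replica 0 gets the same Pod name, hence the same PVC,
# hence the same volume and the same zone constraint as before.
claim = pvc_name("www", "web", 0)
print(claim, bound_zone[claim])  # www-web-0 us-east-1a
```

Because the recreated Pod refers to the same claim, the zone restriction comes along with the already-bound volume; the StatefulSet controller itself never has to reason about zones.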