Info:
Details: I have installed statefulset in our kubernetes cluster, however, it stuck it "ContainerCreating" status. Upon checking the logs, the error is "AttachVolume.Attach failed for volume pvc-xxxxxx: error finding instance ip-xxxxx : "instance not found"
It was succesfully installed around 17 days ago, but re-installing for an update caused the pod to stuck in ContainerCreating.
Manual attaching volume to the instance works. But doing it via storage class is not working and stuck in ContainerCreating status.
storageclass:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
annotations:
storageclass.kubernetes.io/is-default-class: "true"
name: ssd-default
allowVolumeExpansion: true
parameters:
encrypted: "true"
type: gp2
provisioner: kubernetes.io/aws-ebs
reclaimPolicy: Delete
volumeBindingMode: Immediate
pvc yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
annotations:
pv.kubernetes.io/bind-completed: "yes"
pv.kubernetes.io/bound-by-controller: "yes"
volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/aws-ebs
finalizers:
- kubernetes.io/pvc-protection
labels:
app.kubernetes.io/instance: thanos-store
app.kubernetes.io/name: thanos-store
name: data-thanos-store-0
namespace: thanos
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 3Gi
storageClassName: ssd-default
volumeMode: Filesystem
volumeName: pvc-xxxxxx
status:
accessModes:
- ReadWriteOnce
capacity:
storage: 3Gi
phase: Bound
pv yaml:
apiVersion: v1
kind: PersistentVolume
metadata:
annotations:
kubernetes.io/createdby: aws-ebs-dynamic-provisioner
pv.kubernetes.io/bound-by-controller: "yes"
pv.kubernetes.io/provisioned-by: kubernetes.io/aws-ebs
finalizers:
- kubernetes.io/pv-protection
labels:
failure-domain.beta.kubernetes.io/region: ap-xxx
failure-domain.beta.kubernetes.io/zone: ap-xxx
name: pvc-xxxx
spec:
accessModes:
- ReadWriteOnce
awsElasticBlockStore:
fsType: ext4
volumeID: aws://xxxxx
capacity:
storage: 3Gi
claimRef:
apiVersion: v1
kind: PersistentVolumeClaim
name: data-athena-thanos-store-0
namespace: thanos
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: failure-domain.beta.kubernetes.io/region
operator: In
values:
- ap-xxx
- key: failure-domain.beta.kubernetes.io/zone
operator: In
values:
- ap-xxx
persistentVolumeReclaimPolicy: Delete
storageClassName: ssd-default
volumeMode: Filesystem
status:
phase: Bound
Describe pvc:
Name: data-athena-thanos-store-0
Namespace: athena-thanos
StorageClass: ssd-encrypted
Status: Bound
Volume: pvc-xxxx
Labels: app.kubernetes.io/instance=athena-thanos-store
app.kubernetes.io/name=athena-thanos-store
Annotations: pv.kubernetes.io/bind-completed: yes
pv.kubernetes.io/bound-by-controller: yes
volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/aws-ebs
Finalizers: [kubernetes.io/pvc-protection]
Capacity: 3Gi
Access Modes: RWO
VolumeMode: Filesystem
Mounted By: athena-thanos-store-0
The FailedAttachVolume
error occurs when an EBS volume can’t be detached from an instance and thus cannot be attached to another. The EBS volume has to be in the available state to be attached. FailedAttachVolume
is usually a symptom of an underlying failure to unmount and detach the volume.
Notice that while describing the PVC the StorageClass
name is ssd-encrypted
which is a mismatch with the config you showed earlier where the kind: StorageClass
name is ssd-default
. That's why you can mount the volume manually but not via the StorageClass
. You can drop and recreate the StorageClass
with a proper data.
Also, I recommend going through this article and using volumeBindingMode: WaitForFirstConsumer
instead of volumeBindingMode: Immediate
. This setting instructs the volume provisioner to not create a volume immediately, and instead, wait for a pod using an associated PVC to run through scheduling.