Tags: kubernetes, kubernetes-helm

kubectl drain not evicting helm memcached pods


I'm following this guide in an attempt to upgrade a kubernetes cluster on GKE with no downtime. I've gotten all the old nodes cordoned and most of the pods have been evicted, but for a couple of the nodes, kubectl drain just keeps running and not evicting any more pods.

kubectl get pods --all-namespaces -o=wide shows a handful of pods still running on the old pool, and when I run kubectl drain --ignore-daemonsets --force it prints a warning explaining why it's ignoring most of them; the only ones it doesn't mention are the pods I have running memcached, which were created via helm using this chart.
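
For context, the loop I'm running looks roughly like this (the node-pool label value "old-pool" is a placeholder for my actual pool name):

    # Cordon every node in the old pool so no new pods land there
    for node in $(kubectl get nodes -l cloud.google.com/gke-nodepool=old-pool -o name); do
      kubectl cordon "$node"
    done

    # Then drain each node, evicting its pods onto the new pool
    for node in $(kubectl get nodes -l cloud.google.com/gke-nodepool=old-pool -o name); do
      kubectl drain "$node" --ignore-daemonsets --force
    done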

We don't rely too heavily on memcached, so I could just go ahead and delete the old node pool at this point and accept the brief downtime for that one service. But I'd prefer to have a script to do this whole thing the right way, and I wouldn't know what to do at this point if these pods were doing something more important.

So, is this expected behavior somehow? Is there something about that helm chart that's making these pods refuse to be evicted? Is there another force/ignore sort of flag I need to pass to kubectl drain?


Solution

  • The helm chart you linked contains a PodDisruptionBudget (PDB). kubectl drain will not evict a pod if doing so would violate its PDB (see the "How Disruption Budgets Work" section of https://kubernetes.io/docs/concepts/workloads/pods/disruptions/).

    If the PDB's minAvailable equals the number of replicas, no pod can ever be evicted, so the drain waits forever. Given that https://github.com/kubernetes/charts/blob/master/stable/memcached/values.yaml has both set to 3, that is most likely the source of your problem. Set the PDB's minAvailable to one less than the desired number of replicas and kubectl drain will be able to move your pods one by one, as in the sketch below.
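
    As a rough illustration (the release name "my-memcached" and the pdbMinAvailable value name are assumptions; check the values.yaml of the chart version you actually deployed), you can confirm the PDB is what's blocking the drain and then lower minAvailable:

        # A PDB showing 0 allowed disruptions is what makes kubectl drain hang
        kubectl get pdb --all-namespaces

        # Re-deploy the release with minAvailable one below the replica count
        # (value names are placeholders -- adjust for your chart version and release)
        helm upgrade my-memcached stable/memcached --set replicaCount=3,pdbMinAvailable=2

    After the upgrade, the drain can evict one memcached pod at a time while the PDB keeps the remaining replicas available.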