kubernetes, kubernetes-pod

Auto-delete CrashLoopBackOff pods in a deployment


In my Kubernetes cluster, there are multiple deployments in a namespace. For one specific deployment, pods in the "CrashLoopBackOff" state must not be allowed to linger. So basically, when any pod gets into this state, I want it to be deleted; the ReplicaSet already takes care of creating a new pod afterwards.

I tried a custom controller, with the thought that its SharedInformer would notify me about the pod's state and I would then delete the pod from that event loop. However, this introduces a dependency on the pod that runs the custom controller itself.

I also tried searching for any option to be configured in the manifest itself, but could not find any.

I am pretty new to Kubernetes, so I need help implementing this behaviour.


Solution

  • Firstly, you should address the reason why the pod has entered the CrashLoopBackOff state rather than just delete it. Otherwise you'll just recreate the problem and end up deleting pods repeatedly. For example, if your pod is trying to access an external DB and that DB is down, it will crash-loop, and deleting and restarting the pod won't fix that.

    Secondly, if you want to do this deletion in an automated manner, an easy way would be to run a CronJob resource that goes through your deployment's pods and deletes the ones stuck in CrashLoopBackOff, something like the sketch below. You could set the CronJob to run once an hour or on whatever schedule you wish.
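
    As a rough illustration, here is a minimal CronJob sketch. The namespace `my-namespace`, the `crashloop-cleaner` name, the `app=my-app` label selector, and the `bitnami/kubectl` image are all placeholders you'd replace with your own values:

    ```yaml
    # Minimal sketch: an hourly CronJob that deletes CrashLoopBackOff pods
    # belonging to one deployment (identified here by an assumed app=my-app label).
    apiVersion: batch/v1
    kind: CronJob
    metadata:
      name: crashloop-cleaner            # illustrative name
      namespace: my-namespace            # assumed namespace
    spec:
      schedule: "0 * * * *"              # once an hour
      jobTemplate:
        spec:
          template:
            spec:
              # Assumed ServiceAccount bound to a Role that allows get/list/delete on pods.
              serviceAccountName: crashloop-cleaner
              restartPolicy: Never
              containers:
              - name: cleaner
                image: bitnami/kubectl:latest   # any image with kubectl and a shell works
                command:
                - /bin/sh
                - -c
                - |
                  # Print "<pod-name> <waiting-reason(s)>" for each pod of the target
                  # deployment, keep the ones in CrashLoopBackOff, and delete them.
                  # The ReplicaSet will create replacement pods automatically.
                  pods=$(kubectl get pods -n my-namespace -l app=my-app \
                    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.containerStatuses[*].state.waiting.reason}{"\n"}{end}' \
                    | grep CrashLoopBackOff | awk '{print $1}')
                  for p in $pods; do
                    kubectl delete pod "$p" -n my-namespace
                  done
    ```

    Note that this assumes you also create the RBAC pieces (a ServiceAccount plus a Role/RoleBinding granting `get`, `list`, and `delete` on pods in that namespace), and that the label selector actually matches the pod template labels of the deployment you care about.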