kubernetes  cron  kubernetes-cronjob  k8s-cronjobber

Kubernetes Cronjob: Reset missed start times after cluster recovery


I have a cluster that includes a Cronjob scheduled to run every 5 minutes.

We recently experienced an issue that incurred downtime and required manual recovery of the cluster. Although now healthy again, this particular cronjob is failing to run with the following error:

Cannot determine if job needs to be started: Too many missed start time (> 100). Set or decrease .spec.startingDeadlineSeconds or check clock skew.

I understand that the Cronjob 'missed' a number of scheduled jobs while the cluster was down, and that this count has passed a threshold beyond which no further jobs will be scheduled.

How can I reset the number of missed start times and have these jobs scheduled again, without all of the missed jobs suddenly being scheduled to run?


Solution

  • Per the Kubernetes CronJob docs, there does not seem to be a clean way to resolve this. Setting .spec.startingDeadlineSeconds to a large number will re-schedule all missed occurrences that fall within the increased window.

    My solution was just to kubectl delete cronjob x-y-z and recreate it, which worked as desired.
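
The delete-and-recreate approach above could be sketched roughly as follows. This is a minimal sketch, not a definitive procedure: the function name recreate_cronjob is made up for illustration, x-y-z is the placeholder name from the answer, and it assumes you still have the manifest the CronJob was originally created from (cronjob.yaml below is a hypothetical path). Note that deleting a CronJob normally also removes the Jobs it owns, so check that this is acceptable first.

```shell
#!/usr/bin/env bash
# Sketch: delete a stuck CronJob and recreate it from its manifest.
# Both arguments are placeholders: the CronJob name and the path to the
# manifest file it was originally created from.
recreate_cronjob() {
  local name="$1" manifest="$2"

  # Refuse to delete anything unless the manifest is available to recreate it.
  if [ ! -f "$manifest" ]; then
    echo "manifest not found: $manifest" >&2
    return 1
  fi

  # Deleting the object discards its status, including lastScheduleTime,
  # which is what the missed-start count is measured from.
  kubectl delete cronjob "$name" || return 1

  # Recreate the CronJob as a fresh object with a new creation timestamp.
  kubectl apply -f "$manifest" || return 1

  # Show the recreated CronJob so its schedule can be checked.
  kubectl get cronjob "$name"
}
```

With the function loaded into a shell, a call would look like `recreate_cronjob x-y-z cronjob.yaml`. Add `-n <namespace>` to the kubectl calls if the CronJob does not live in the current namespace.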