I've created and pushed a cron job to deployment, but when I see it running in OpenShift, I get the following error message:
Cannot determine if job needs to be started: Too many missed start time (> 100). Set or decrease .spec.startingDeadlineSeconds or check clock skew.
As I understand it, this means that a job failed to run, but I don't understand why it is failing. Why isn't that logged somewhere? Or, if it is, where can I find it?
The CronJob controller will keep trying to start a job according to the most recent schedule, but it keeps failing, and evidently it has done so more than 100 times.
I've checked the syntax of my cron job, and it doesn't produce any errors. Besides, if there were any syntax errors, I wouldn't even be allowed to push.
Anyone know what's wrong?
My CronJob:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: my-cjob
  labels:
    job-name: my-cjob
spec:
  schedule: "*/5 * * * *"
  # activeDeadlineSeconds: 180 # 3 min <<- should this help and why?
  jobTemplate:
    spec:
      template:
        metadata:
          name: my-cjob
          labels:
            job-name: my-cjob
        spec:
          containers:
          - name: my-cjob
            image: my-image-name
          restartPolicy: OnFailure
Or should I be using startingDeadlineSeconds? Has anyone hit this error message and found a solution?
Update, as requested in the comments
When running kubectl get cronjob
I get the following:
NAME      SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
my-cjob   */5 * * * *   False     0        <none>          2d
When running kubectl logs my-cjob
I get the following:
Error from server (NotFound): pods "my-cjob" not found
When running kubectl describe cronjob my-cjob
I get the following:
Error from server (NotFound): the server could not find the requested resource
When running kubectl logs <cronjob-pod-name>
I get many lines of code, which are very difficult for me to understand and sort out.
When running kubectl describe pod <cronjob-pod-name>
I also get a lot, but this is much easier to sort through. Is there anything specific I should look for?
Running kubectl get events
I get a lot, but I think this is the related one:
LAST SEEN FIRST SEEN COUNT NAME KIND SUBOBJECT TYPE REASON SOURCE MESSAGE
1h 1h 2 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Pod spec.containers{apiproxy} Warning Unhealthy kubelet, xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx Liveness probe failed: Get http://xxxx/xxxx: dial tcp xxxx:8080: connect: connection refused
Setting startingDeadlineSeconds to 180 and removing the labels from the pod template (spec.jobTemplate.spec.template.metadata.labels) fixed the problem.
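
For anyone landing here later, the manifest with both changes applied would look roughly like the sketch below. It is reconstructed from the description in this post, not copied from the final file, so treat the exact layout as an assumption. As far as the Kubernetes documentation describes it, when startingDeadlineSeconds is unset the controller counts missed schedules from the last scheduled time up to now. With a run every 5 minutes, a CronJob that is 2 days old and has never been scheduled has missed roughly 576 start times, well over the limit of 100. With startingDeadlineSeconds set, only the schedules that fall within that many seconds before now are counted.

```yaml
apiVersion: batch/v1beta1        # as in the question; newer clusters serve CronJob under batch/v1
kind: CronJob
metadata:
  name: my-cjob
  labels:
    job-name: my-cjob
spec:
  schedule: "*/5 * * * *"
  # Only count missed runs from the last 180 seconds when deciding
  # whether a job still needs to be started.
  startingDeadlineSeconds: 180
  jobTemplate:
    spec:
      template:
        metadata:
          name: my-cjob
          # labels removed here, as described above
        spec:
          containers:
          - name: my-cjob
            image: my-image-name
          restartPolicy: OnFailure
```

Note that 180 seconds is shorter than the 5-minute interval, so a run that cannot be started within 3 minutes of its scheduled time is skipped rather than started late.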