I have a Python process that I want to fire up every n minutes in a Kubernetes CronJob to read a number of messages (say 5) from a queue, then process/convert some files and run analysis on the results based on those queue messages. If the process is still running after n minutes, I don't want to start a new one. In total, I would like a number of these (say 3) to be able to run at the same time; however, there can never be more than 3 processes running at once. To try to implement this, I tried the following (simplified):
apiVersion: batch/v1
kind: CronJob
metadata:
  name: some-job
  namespace: some-namespace
spec:
  schedule: "*/5 * * * *"
  concurrencyPolicy: "Forbid"
  jobTemplate:
    spec:
      parallelism: 3
      template:
        spec:
          restartPolicy: OnFailure  # Job pod templates require Never or OnFailure; the default "Always" is rejected
          containers:
            - name: job
              image: myimage:tag
              imagePullPolicy: Always
              command: ['python', 'src/run_job.py']
What this amounts to is a maximum of three pods running at the same time, because parallelism is 3 and concurrencyPolicy is "Forbid", even if the processes run past the 5-minute mark.
The problem I specifically have is that one pod (e.g. pod 1) can take longer than the other two to finish: pods 2 and 3 might finish after a minute, while pod 1 only finishes after 10 minutes because it is processing larger files from the queue.
Where I thought that parallelism: 3 would cause pods 2 and 3 to be deleted and replaced after finishing (when the next cron interval hits), they are not; they have to wait for pod 1 to finish before three new pods are started at the next cron interval.
When I think about it, this behaviour makes sense given the specification and meaning of a CronJob. However, I would like to know whether it is possible to make these pods/processes not depend on one another for restart, without having to define duplicate CronJobs that each run a single process.
Alternatively, I would like to know whether it's possible to easily launch such duplicate CronJobs without copying them into multiple manifests.
Duplicate CronJobs seem to be the way to achieve what you are looking for: create 3 copies, each running a single job at a time (parallelism: 1 with its own concurrencyPolicy: "Forbid"), so that a slow pod in one copy does not hold up the other two. You can template the manifest and generate the copies instead of maintaining multiple hand-written files, similar to the parallel-processing-expansion example in the Kubernetes docs. That example is not in your exact problem context, but it shows the idea: http://kubernetes.io/docs/tasks/job/parallel-processing-expansion
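As a rough sketch of that templating approach (the script name generate_cronjobs.py, the some-job-N naming, the output file names, and the count of 3 are just assumptions for illustration, not something from your setup), a small Python script could expand one CronJob template into three independent manifests, each with parallelism: 1:

# generate_cronjobs.py -- hypothetical helper that expands one CronJob template
# into N near-identical manifests so each worker is scheduled independently.
TEMPLATE = """\
apiVersion: batch/v1
kind: CronJob
metadata:
  name: some-job-{index}
  namespace: some-namespace
spec:
  schedule: "*/5 * * * *"
  concurrencyPolicy: "Forbid"     # each copy still refuses to overlap with itself
  jobTemplate:
    spec:
      parallelism: 1              # one pod per copy; copies don't wait for each other
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: job
              image: myimage:tag
              imagePullPolicy: Always
              command: ['python', 'src/run_job.py']
"""

NUM_COPIES = 3  # assumed number of independent workers

def main() -> None:
    for i in range(1, NUM_COPIES + 1):
        filename = f"cronjob-{i}.yaml"
        with open(filename, "w") as f:
            f.write(TEMPLATE.format(index=i))
        print(f"wrote {filename}")

if __name__ == "__main__":
    main()

You would then apply the generated files with kubectl apply -f . (or list them explicitly). Because every copy has its own concurrencyPolicy: "Forbid" and runs only one pod, each one is rescheduled on its own cadence, independent of how long the others take.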