I am deploying a Django app with Celery workers on AWS EKS. Everything runs as expected, except that Kubernetes stops Celery worker replicas before their in-flight tasks finish. The same thing happens when I make a new deployment or push new code to the master branch.
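For reference, here is a minimal sketch of the worker Deployment (the image, names, and labels are placeholders, not my real config). With the default 30-second grace period, Kubernetes sends SIGTERM, which Celery treats as a warm shutdown and starts draining tasks, but the pod is SIGKILLed once the grace period expires:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: celery-worker
spec:
  replicas: 3
  selector:
    matchLabels:
      app: celery-worker
  template:
    metadata:
      labels:
        app: celery-worker
    spec:
      # terminationGracePeriodSeconds defaults to 30 when omitted,
      # so any task running longer than that is cut off by SIGKILL
      containers:
        - name: worker
          image: myapp:latest  # placeholder image
          command: ["celery", "-A", "myapp", "worker", "--loglevel=info"]
```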
More information: Celery is set up with Redis as both the message broker and the result backend.

What I have tried: after some research I started considering KEDA, but from reading the docs it seems it would only let me scale the Celery pods based on queue length; it doesn't provide the graceful-shutdown mechanism I am looking for.
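For what it's worth, this is roughly what the KEDA route looks like (a sketch; the broker address and threshold are made up, and Celery's default queue is a Redis list named `celery`). It scales replicas on queue length but says nothing about letting a worker finish before it is killed:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: celery-worker-scaler
spec:
  scaleTargetRef:
    name: celery-worker  # the worker Deployment above
  triggers:
    - type: redis
      metadata:
        address: redis:6379  # placeholder broker address
        listName: celery     # Celery's default queue name
        listLength: "10"     # target number of pending tasks per replica
```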
Is there any workaround to solve this issue?
Update: I ended up setting a very long termination grace period of 5 hours.
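Here is the relevant part of the worker Deployment after the change. Celery's warm shutdown on SIGTERM now gets up to 5 hours to drain in-flight tasks before Kubernetes escalates to SIGKILL:

```yaml
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 18000  # 5 hours
```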