python, postgresql, mutex, kubernetes, distributed-system

Ensuring at most a single instance of a job executing on Kubernetes and writing into PostgreSQL


I have a Python program that I am running as a Job on a Kubernetes cluster every 2 hours. I also have a webserver that starts the job whenever a user clicks a button on a page.

I need to ensure that at most one instance of the Job is running on the cluster at any given time.

Given that I am using Kubernetes to run the job and connecting to PostgreSQL from within the job, the solution should somehow leverage these two. I thought about it a bit and came up with the following ideas:

  1. Find a setting in Kubernetes that would enforce this limit, so that attempts to start a second instance would fail. I was unable to find such a setting.
  2. Create a shared lock, or mutex. The disadvantage is that if the job crashes, it may not release the lock before quitting.
    1. Kubernetes is running etcd; maybe I can use that.
    2. Create a 'lock' table in PostgreSQL; when a new instance connects, it checks whether it is the only one running. Use transactions somehow so that one instance wins and proceeds while the others quit. I have not thought this through yet, but it should work.
  3. Query the Kubernetes API for a label I use on the job and check whether any instances are already running. This may not be atomic, so more than one instance may slip through.

What are the usual solutions to this problem given the platform choices I made? What should I do so that I don't reinvent the wheel and still end up with something reliable?


Solution

  • A completely different approach would be to run a (web) server that executes the job functionality. At a high level, the idea is that the webserver contacts this new job server whenever the job needs to run. In addition, the job server has an internal cron-like schedule that triggers the same functionality every 2 hours.

    There could be 2 approaches to implementing this:

    1. You can put the checking mechanism inside the job server code to ensure that even if 2 API calls reach the job server simultaneously, only one executes while the other waits. You could use the language platform's locking features to achieve this, or use a message queue (see the sketch after this list).
    2. You can put the checking mechanism outside the job server code (in the database) to ensure that only one API call executes; this is similar to what you suggested. If you use a Postgres transaction, you don't have to worry about your job crashing and leaving the lock set.
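    As a rough illustration of approach 1, here is a minimal sketch using Python's own locking. It assumes exactly one job-server process, and the names (handle_trigger, do_the_work) are hypothetical placeholders:

```python
import threading

# Module-level lock shared by all triggers handled by this job-server process.
# This only helps while there is exactly ONE job-server process running.
_job_lock = threading.Lock()

def handle_trigger():
    # A second trigger (user click or the internal schedule) blocks here
    # until the running job finishes, so at most one job executes at a time.
    with _job_lock:
        do_the_work()

def do_the_work():
    print("job body goes here")  # placeholder for the real job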

    The pros and cons of both approaches are straightforward. The major difference in my mind between 1 and 2 is that if you update the job server code, you might briefly have 2 job servers running at the same time, which would destroy the isolation property you want. Hence, the database might work better, or be more idiomatic in the k8s sense (all servers are stateless, so all the k8s goodies work; put any shared state in a database that can handle concurrency).

    Addressing your ideas, here are my thoughts:

    1. Find a setting in k8s that will limit this: k8s will not start two objects with the same name (in the metadata of the spec). But anything else goes for a Job, and k8s will happily start another one.

    2. (idea 2.1) etcd3 supports distributed locking primitives. However, I've never used this and I don't really know what to watch out for; a hedged sketch follows this list.

    3. (idea 2.2) A Postgres lock should work. Even in the case of a job crash, you don't have to worry about the lock remaining set, provided the lock is tied to the transaction or session; a sketch using a session-level advisory lock follows this list.

    4. Querying the k8s API server for things that should be atomic is, as you said, not a good idea. I've used a system that reacts to k8s events (like an annotation change on an object spec), but I've had bugs where my 'operator' suddenly stops getting k8s events and needs to be restarted; and again, if I push an update to the event-handler server, there might be 2 event handlers running at the same time.
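    For the etcd option, a minimal sketch, assuming the third-party python-etcd3 client and an etcd endpoint reachable from the job; the host, lock name, and TTL below are made-up values:

```python
import etcd3

def run_job():
    print("job body goes here")  # placeholder for the real job

# Assumption: etcd is reachable at this address from inside the cluster.
client = etcd3.client(host="etcd.default.svc", port=2379)

# Lease-backed lock: if the job crashes, the lease expires after `ttl`
# seconds and the lock is released automatically, so no manual cleanup.
lock = client.lock("two-hourly-job", ttl=120)

if lock.acquire(timeout=5):
    try:
        # Jobs that run longer than the TTL need to refresh the lease periodically.
        run_job()
    finally:
        lock.release()
else:
    print("another instance holds the lock; exiting")
```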
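    And a sketch of the Postgres option using a session-level advisory lock instead of a hand-rolled lock table; the connection string and the lock key 42 are arbitrary placeholders. Because the lock belongs to the database session, Postgres releases it automatically if the job crashes or the connection drops:

```python
import sys
import psycopg2

def run_job():
    print("job body goes here")  # placeholder for the real job

# Assumption: connection details are whatever your job already uses.
conn = psycopg2.connect("dbname=jobs user=job host=db.default.svc")
conn.autocommit = True

with conn.cursor() as cur:
    # pg_try_advisory_lock returns true only for the first session asking for key 42;
    # every other session gets false immediately instead of blocking.
    cur.execute("SELECT pg_try_advisory_lock(42)")
    got_lock = cur.fetchone()[0]

if not got_lock:
    conn.close()
    sys.exit("another instance is already running; exiting")

try:
    run_job()
finally:
    with conn.cursor() as cur:
        cur.execute("SELECT pg_advisory_unlock(42)")
    conn.close()
```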

    I would recommend sticking with whatever you are most familiar with. In my case that would be implementing a job-server-like k8s Deployment that runs as a server and listens for events/API calls.
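    To make that concrete, a rough skeleton of such a job server, assuming Flask purely for illustration (any small HTTP framework would do); the route and port are made-up, and the job body should be guarded by one of the locking mechanisms sketched above:

```python
import threading
import time
from flask import Flask  # assumption: Flask stands in for any HTTP framework

app = Flask(__name__)

def run_job():
    # Placeholder for the real job body; guard it with one of the locking
    # mechanisms sketched above (in-process lock, etcd, or Postgres advisory lock).
    print("running the 2-hourly job")

@app.route("/run", methods=["POST"])
def run_from_webserver():
    # The existing webserver calls this endpoint when the user clicks the button.
    run_job()
    return "ok\n"

def internal_schedule():
    # Internal "cron": trigger the same functionality every 2 hours.
    while True:
        time.sleep(2 * 60 * 60)
        run_job()

if __name__ == "__main__":
    threading.Thread(target=internal_schedule, daemon=True).start()
    app.run(host="0.0.0.0", port=8080)
```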