Tags: python, kill-process, apscheduler

Stop the running instances when max_instances is reached


I'm using apscheduler-django and I created a task that loops every 10 seconds.

This function will make a request to an API and save the content to my database (PostgreSQL).

This is my task:

scheduler.add_job(
  SaveAPI,
  trigger=CronTrigger(second="*/10"), 
  id="SaveAPI", 
  max_instances=1,
  replace_existing=True,
)
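
For context, the scheduler object itself isn't shown above; a typical django-apscheduler setup (roughly what my snippet assumes) looks like this:

from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.triggers.cron import CronTrigger
from django_apscheduler.jobstores import DjangoJobStore

scheduler = BackgroundScheduler()
scheduler.add_jobstore(DjangoJobStore(), "default")
# ... the add_job(...) call above goes here ...
scheduler.start()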

and my SaveAPI is:

def SaveAPI():
    SPORT = 3
    print('API Queue Started')
    AllMatches = GetAllMatches(SPORT)
    for Match in AllMatches:
        AddToDatabase(Match, SPORT)
    print(f'API Queue Ended')

GetAllMatches and AddToDatabase are too big to include here, and I don't think their implementations are relevant to my question.
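
(If you want something runnable to reproduce the setup, hypothetical stand-ins of roughly this shape are enough: a slow API call followed by a write per match.)

import requests  # any HTTP client would do

def GetAllMatches(sport):
    # Hypothetical stand-in: a network call that can take longer than 10 seconds.
    response = requests.get(f"https://api.example.com/matches?sport={sport}", timeout=60)
    return response.json()

def AddToDatabase(match, sport):
    # Hypothetical stand-in: the real code saves the match to PostgreSQL via Django.
    print(f"saving match {match} for sport {sport}")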

My problem is that sometimes I get this error:

Run time of job "SaveAPI (trigger: cron[second='*/10'], next run at: 2022-03-05 23:21:00 +0330)" was missed by 0:00:11.445357

When this happens, the running instance does not get replaced with a new one because my SaveAPI function never ends, and apscheduler keeps missing the new runs.

I have run many tests, and the function itself does not have any problem.

How can I make apscheduler stop the last running instance if a new instance is going to be missed?

So if my last instance takes more than 10 seconds, I want to just terminate the instance and create a new one.


Solution

  • apscheduler and apscheduler-django don't directly support that.

    You can implement and use a custom executor that keeps track of the process running each job and kills that process when a new run of the same job is submitted while a previous run is still in progress.

    Here's a MaxInstancesCancelEarliestProcessPoolExecutor that uses pebble.ProcessPool (the lines marked # + are the additions relative to apscheduler's stock pool executor code):

    from collections import defaultdict
    from concurrent.futures import CancelledError
    from concurrent.futures.process import BrokenProcessPool
    import threading

    from pebble import ProcessPool
    from apscheduler.executors.base import MaxInstancesReachedError, run_job
    from apscheduler.executors.pool import BasePoolExecutor


    class MaxInstancesCancelEarliestProcessPoolExecutor(BasePoolExecutor):
        """Instead of refusing a job run when max_instances is reached,
        cancel (terminate) the earliest running instance and submit the new one."""

        def __init__(self):
            pool = ProcessPool()
            # Adapt pebble's schedule() to the submit() interface BasePoolExecutor expects.
            pool.submit = lambda function, *args: pool.schedule(function, args=args)
            super().__init__(pool)
            self._futures = defaultdict(list)  # job.id -> futures of pending/running instances
    
        def submit_job(self, job, run_times):
            assert self._lock is not None, 'This executor has not been started yet'
            with self._lock:
                if self._instances[job.id] >= job.max_instances:
                    # Cancel the earliest running instance of this job; with pebble,
                    # cancelling a running future terminates its worker process.
                    f = self._futures[job.id][0]                      # +
                    f.cancel()                                        # +
                    try:                                              # +
                        self._pool._pool_manager.update_status()      # +
                    except RuntimeError:                              # +
                        pass                                          # +
                    if self._instances[job.id] >= job.max_instances:  # +
                        raise MaxInstancesReachedError(job)
    
                self._do_submit_job(job, run_times)
                self._instances[job.id] += 1
    
        def _do_submit_job(self, job, run_times):
            def callback(f):
                with self._lock:                        # +
                    self._futures[job.id].remove(f)     # +
                    try:                                # +
                        exc, tb = (f.exception_info() if hasattr(f, 'exception_info') else
                                   (f.exception(), getattr(f.exception(), '__traceback__', None)))
                    except CancelledError:              # +
                        exc, tb = TimeoutError(), None  # +
                    if exc:
                        self._run_job_error(job.id, exc, tb)
                    else:
                        self._run_job_success(job.id, f.result())
    
            try:
                f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name)
            except BrokenProcessPool:
                self._logger.warning('Process pool is broken; replacing pool with a fresh instance')
                self._pool = self._pool.__class__(self._pool._max_workers)
                f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name)
    
            f.add_done_callback(callback)
            self._futures[job.id].append(f)  # +
    
        def shutdown(self, wait=True):
            if wait:
                self._pool.close()
                self._pool.join()
            else:
                self._pool.close()
                threading.Thread(target=self._pool.join).start()
    

    Usage:

    scheduler.add_executor(MaxInstancesCancelEarliestProcessPoolExecutor(), alias='max_instances_cancel_earliest')
    
    scheduler.add_job(
        SaveAPI,
        trigger=CronTrigger(second="*/10"),
        id="SaveAPI",
        max_instances=1,
        executor='max_instances_cancel_earliest',  # +
        replace_existing=True,
    )
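
    To see the cancellation in action, you can schedule a deliberately slow job through the same executor (slow_job below is just an illustrative stand-in, not part of apscheduler) and watch the earliest instance get terminated each time the next run fires:

    import time

    def slow_job():
        print('slow_job started')
        time.sleep(25)              # deliberately longer than the 10-second interval
        print('slow_job finished')  # only reached when the instance isn't cancelled

    scheduler.add_job(
        slow_job,
        trigger=CronTrigger(second="*/10"),
        id="slow_job",
        max_instances=1,
        executor='max_instances_cancel_earliest',
        replace_existing=True,
    )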