Tags: python, async-await, gunicorn, background-task, fastapi

Need to send a response status code right away with FastAPI while running jobs synchronously in the background


I have a very time-consuming task (image processing) that receives its input data from a request to a FastAPI endpoint. To keep the caller responsive, I need to send an instant response such as "ok", along with a 201 status code (the latter optional).

So far I've been using this:

from fastapi import BackgroundTasks, FastAPI

app = FastAPI()

def main_process(bucket, document_url, reference_id, return_url):
    ...  # some long task

@app.post('/task')
async def do_task(reference_id: int,
                  bucket: str,
                  document_url: str,
                  return_url: str,
                  background_tasks: BackgroundTasks):

    background_tasks.add_task(main_process, bucket, document_url, reference_id, return_url)
    return 'ok'

Each main_process task downloads an image from an S3 bucket and then does some processing. The solution shown above works fine until roughly 10 images are being processed in the background (given the async def), and then it crashes.

I've also tried increasing some gunicorn parameters, like setting max-requests to 100:

gunicorn api:app -b 0.0.0.0:8000 -w 4 -k uvicorn.workers.UvicornWorker --preload --max-requests 100 --daemon

That gave me some more headroom (around 20 more images), but it still crashes.

I've also considered using Celery or some other distributed task queue, but I want to keep things as simple as possible.

Since async behaviour is not crucial, but an instant response is, is it possible to switch to a synchronous solution while still returning an "ok" response right away?


Solution

  • No, you'll have to actually dispatch the task and delegate it to a processing backend. Such a backend can be quite simple, e.g. just a task queue (Celery/AMQP, Redis, a relational database, whatever suits your needs) plus at least one worker process that consumes the queue, performs the computation, and feeds the result back into storage (see the sketch below).

    When you dispatch the request from your API, generate a UUID at the same time and store it alongside the job in the queue. When you send your quick 200 OK back to the caller, also give them their job's UUID (if required). They'll hit your API again to query for the result; have them provide the UUID and use it to look up the result in your storage backend.

    To avoid computing the same request twice, generate a hash of the request and use that instead of the UUID (watch out for collisions; use a sufficiently long hash). This works easily as long as you don't have to cope with per-user/image permissions. A sketch of that follows as well, below.
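
    Here is a minimal sketch of that pattern, assuming Redis serves as both the queue and the result store. The queue name 'jobs', the 'result:{job_id}' key scheme, and the /task/{job_id} polling endpoint are illustrative choices, not fixed APIs:

    # api.py
    import json
    import uuid

    import redis
    from fastapi import FastAPI, HTTPException

    app = FastAPI()
    r = redis.Redis()

    @app.post('/task', status_code=201)
    def do_task(reference_id: int, bucket: str, document_url: str, return_url: str):
        job_id = str(uuid.uuid4())
        # Enqueue the job; the API returns immediately, a worker does the rest.
        r.lpush('jobs', json.dumps({
            'job_id': job_id,
            'reference_id': reference_id,
            'bucket': bucket,
            'document_url': document_url,
            'return_url': return_url,
        }))
        return {'status': 'ok', 'job_id': job_id}

    @app.get('/task/{job_id}')
    def get_result(job_id: str):
        # The caller polls here with the UUID it was given at dispatch time.
        raw = r.get(f'result:{job_id}')
        if raw is None:
            raise HTTPException(status_code=404, detail='result not ready')
        return json.loads(raw)

    The worker is a separate process (run e.g. as python worker.py), entirely outside gunicorn, so a long job can never block or crash an API worker:

    # worker.py
    import json

    import redis

    r = redis.Redis()

    def main_process(bucket, document_url, reference_id, return_url):
        ...  # download the image from S3 and do the heavy processing

    while True:
        # BRPOP blocks until a job shows up in the queue.
        _, raw = r.brpop('jobs')
        job = json.loads(raw)
        result = main_process(job['bucket'], job['document_url'],
                              job['reference_id'], job['return_url'])
        r.set(f"result:{job['job_id']}", json.dumps(result))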
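
    And a sketch of the hash-based deduplication, again with hypothetical names (job_key, the 'job:{key}' marker); sha256 is long enough that collisions are a non-issue in practice:

    import hashlib
    import json

    def job_key(reference_id, bucket, document_url, return_url):
        # Identical requests always produce the identical key.
        payload = json.dumps([reference_id, bucket, document_url, return_url])
        return hashlib.sha256(payload.encode()).hexdigest()

    # At dispatch time, use the key in place of the UUID and only enqueue
    # the job the first time this exact request is seen:
    key = job_key(reference_id, bucket, document_url, return_url)
    if r.setnx(f'job:{key}', 1):  # True only for the first occurrence
        ...  # enqueue the job under this key instead of a UUID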