python docker docker-compose fastapi google-cloud-run

Initialise FastAPI web service fully before listening on a port

I have a FastAPI web app running inside a Docker container. The web app hosts multiple machine learning models, which it needs to read into memory on startup. This process can take around 30 seconds.

What I am trying to achieve is to have the container start up and load all the models, classes, etc. before listening for any requests. This is useful because on platforms such as Google Cloud Run, you have a max of 4 minutes of startup before your container must start listening to traffic. Right now my container will drop any traffic whilst it is in this startup mode.

I was wondering if there is a way to achieve this with some Docker or FastAPI magic!

Solution

Yes, this is possible with FastAPI magic! In both solutions presented, FastAPI will not start serving requests until the startup logic (in your case, loading an ML model) completes.

In practice, the app won't start listening on the selected port until the startup logic is complete, and since you're running your app on Google Cloud Run, the platform won't send traffic to your app until then.

If you're using FastAPI version < 0.93.0:

You should load your model in the "startup event handler". Example:

from fastapi import FastAPI


def fake_answer_to_everything_ml_model(x: float):
    return True


ml_models = {}

app = FastAPI()

@app.on_event("startup")
async def app_startup():
    # Load your ML model:
    ml_models["answer_to_everything"] = fake_answer_to_everything_ml_model

If you're using FastAPI version >= 0.93.0:

You should define your startup logic using the lifespan parameter of the FastAPI app. More comprehensive documentation can be found in the "Lifespan Events" page of the FastAPI docs; I'll just show a minimal example:

from contextlib import asynccontextmanager

from fastapi import FastAPI


def fake_answer_to_everything_ml_model(x: float):
    return True


ml_models = {}


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Load your ML model
    ml_models["answer_to_everything"] = fake_answer_to_everything_ml_model
    yield
    # If you want you can define your shutdown logic after the `yield` keyword.


app = FastAPI(lifespan=lifespan)
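
To see why the code before the `yield` is guaranteed to finish before any request is handled, here is a rough sketch of the ordering using only the standard library. It is a simplification for illustration, not FastAPI's actual implementation: `handle_request`, `main`, the `events` list and the 0.1 second sleep are all made-up stand-ins (for a route handler, the server, a log, and a slow model load respectively).

```python
import asyncio
from contextlib import asynccontextmanager

ml_models = {}
events = []  # hypothetical log of the order in which things happen


def fake_answer_to_everything_ml_model(x: float):
    return True


@asynccontextmanager
async def lifespan(app):
    # Stand-in for a slow model load (assumed 0.1 s delay).
    await asyncio.sleep(0.1)
    ml_models["answer_to_everything"] = fake_answer_to_everything_ml_model
    events.append("startup complete")
    yield
    # Shutdown logic runs after the `yield`: release the models.
    ml_models.clear()
    events.append("shutdown complete")


async def handle_request(x: float):
    # Stand-in for a route handler; it relies on the model being loaded.
    events.append("request handled")
    return ml_models["answer_to_everything"](x)


async def main():
    # Simplified view of the server: enter the lifespan first,
    # handle requests only inside it, and exit it on shutdown.
    async with lifespan(app=None):
        return await handle_request(42.0)


result = asyncio.run(main())
print(events)
```

The request can only be handled inside the `async with` block, and that block is entered only once everything before the `yield` has run, so the model lookup in the handler never hits an empty dictionary.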