
Django/Gunicorn Request Takes Long Time To Begin Processing View


I'm looking into optimizing my application's request times and have done a lot of work to implement caching, reduce duplicate DB calls, etc. However, looking at our monitoring tools, I sometimes see that a request takes an exceptionally long time to even begin processing the view. Is there a way to explain this? It's making it hard to set a consistent SLO for API requests.

My understanding of gunicorn workers and threads is admittedly limited, but I don't believe we would be hitting any limits on our current setup. That is really the only bottleneck I can imagine for processing the request, e.g. no more threads or workers available.
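One way to check whether requests are queueing before a worker picks them up is to compare a timestamp stamped by the load balancer or reverse proxy against the time Django starts handling the request. Below is a minimal middleware sketch; it assumes the proxy sets an `X-Request-Start: t=<unix seconds>` header (in nginx: `proxy_set_header X-Request-Start "t=${msec}";`). The header name and format are assumptions, so adapt them to your setup:

```python
import time

class QueueTimeMiddleware:
    """Rough sketch: measure time a request spent queued before Django saw it.

    Assumes the proxy in front of gunicorn stamps an X-Request-Start
    header with a unix timestamp in seconds, e.g. in nginx:
        proxy_set_header X-Request-Start "t=${msec}";
    Header name and format are assumptions -- adapt to your setup.
    """

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        raw = request.META.get("HTTP_X_REQUEST_START", "")
        if raw.startswith("t="):
            try:
                sent_at = float(raw[2:])
                queue_ms = (time.time() - sent_at) * 1000.0
                # Ship this to your monitoring tool; printed here for brevity.
                print(f"queue time: {queue_ms:.1f} ms")
            except ValueError:
                pass  # malformed header; ignore
        return self.get_response(request)
```

If the measured queue time spikes while view time stays flat, the delay is happening before Django, not inside it.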

  • Django = 3.2.15
  • Django Rest Framework = 3.13.1
  • gunicorn = 20.0.4
  • DB Postgres using RDS

Start Command

      "gunicorn",
      "--workers=4",
      "--threads=8",
      "--bind=0.0.0.0:8000",
      "--worker-class=uvicorn.workers.UvicornWorker",
      "webapp.asgi:application"

Cache Configuration

CACHE_MIDDLEWARE_ALIAS = 'default'
CACHE_MIDDLEWARE_SECONDS = 60
CACHE_MIDDLEWARE_KEY_PREFIX = ''

CACHES = {
    "default": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": f"{REDIS_CONN_STRING}/0",
        "OPTIONS": {
            "CLIENT_CLASS": "django_redis.client.DefaultClient",
        }
    }
}

CACHEOPS_REDIS = f"{REDIS_CONN_STRING}/0"

CACHEOPS = {
    # Disable Op for User/Auth
    'auth.*': None,
    'users.*': None,
    'rest_framework.authtoken.models.token': None,

    '*.*': {'ops': (), 'timeout': 60},
}

This is running on ECS, load-balanced across two c6g.xlarge instances (4 vCPUs each).

ElastiCache instance: cache.t4g.medium, average memory usage: ~400 MB



Solution

  • I would suggest using the power of threads that Python offers.

    Instead of running every task on the main thread, hand slow work off to sub-threads. That keeps the main thread free to accept new requests, as long as you don't bottleneck on compute.

    Also, handling more concurrent requests does not by itself mean higher RPS, though I'm sure you're well aware of your application's average request time.

    Keep an eye, using a tracing tool, on how long a request takes to process under low-to-normal traffic, and base your decision on that data.

    Also take into consideration the location of your VMs: if they are far from your users, network latency alone could be a reasonable explanation for why requests take that long.