python docker nginx flask gunicorn

Gunicorn: worker timeout and worker exiting on all containers


A little background on my issue:
I have the following Gunicorn config file: gunicorn_cofig.py

pidfile = 'app.pid'
worker_tmp_dir = '/dev/shm'
worker_class = 'gthread'
workers = 1
worker_connections = 1000
timeout = 30
keepalive = 1
threads = 2
proc_name = 'app'
bind = '0.0.0.0:8080'
backlog = 2048
accesslog = '-'
errorlog = '-'

Initially, I was running a single Flask app and Nginx in Docker, and Gunicorn never had an issue. Later I added another app and configured Nginx to route requests from different subdomains to different ports (e.g. mydomain.com -> port 81 and app.mydomain.com -> port 5000). If I understand correctly, since all subdomains point to the same IP (I use LetsEncrypt), requests for every subdomain arrive on port 443 and Nginx then reverse-proxies them to the respective ports. For the new app I simply copied the same gunicorn_cofig.py, and it worked fine.

Now I have 6 apps, all using the same config file, and I am getting the following errors:

app1          | [2022-01-13 05:30:26 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:9)
app1          | [2022-01-13 05:30:26 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:10)
app1          | [2022-01-13 05:30:26 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:11)
app1          | [2022-01-13 05:30:26 +0000] [11] [INFO] Worker exiting (pid: 11)
app1          | [2022-01-13 05:30:26 +0000] [9] [INFO] Worker exiting (pid: 9)
app1          | [2022-01-13 05:30:26 +0000] [10] [INFO] Worker exiting (pid: 10)
app1          | [2022-01-13 05:30:27 +0000] [1] [WARNING] Worker with pid 9 was terminated due to signal 9
app1          | [2022-01-13 05:30:27 +0000] [1] [WARNING] Worker with pid 11 was terminated due to signal 9
app1          | [2022-01-13 05:30:27 +0000] [12] [INFO] Booting worker with pid: 12
app1          | [2022-01-13 05:30:27 +0000] [1] [WARNING] Worker with pid 10 was terminated due to signal 9
app1         | [2022-01-13 05:30:27 +0000] [13] [INFO] Booting worker with pid: 13
app2         | [2022-01-13 05:30:27 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:9)
app2         | [2022-01-13 05:30:27 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:10)
app2         | [2022-01-13 05:30:27 +0000] [10] [INFO] Worker exiting (pid: 10)
app2         | [2022-01-13 05:30:27 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:11)
app2         | [2022-01-13 05:30:27 +0000] [9] [INFO] Worker exiting (pid: 9)
app1         | [2022-01-13 05:30:27 +0000] [14] [INFO] Booting worker with pid: 14

and so on, for every app.
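One way to check whether the timeouts hit every container at the same moment (which would hint at a shared resource rather than a single misbehaving app) is to tally the WORKER TIMEOUT lines per service and timestamp. The following is only a minimal sketch, assuming the docker compose log format shown above; the names TIMEOUT_RE and count_timeouts are made up for this illustration.

```python
import re
from collections import Counter

# Matches lines such as:
#   app1 | [2022-01-13 05:30:26 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:9)
# The "|" separator is treated as optional in case it was lost when copying.
TIMEOUT_RE = re.compile(
    r"^(?P<service>\S+)\s*\|?\s*"
    r"\[(?P<ts>[^\]]+)\] \[\d+\] \[CRITICAL\] WORKER TIMEOUT \(pid:(?P<pid>\d+)\)"
)


def count_timeouts(lines):
    """Count WORKER TIMEOUT events per (service, timestamp) pair."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.match(line.strip())
        if match:
            counts[(match.group("service"), match.group("ts"))] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "app1          | [2022-01-13 05:30:26 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:9)",
        "app1          | [2022-01-13 05:30:26 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:10)",
        "app1          | [2022-01-13 05:30:26 +0000] [11] [INFO] Worker exiting (pid: 11)",
        "app2         | [2022-01-13 05:30:27 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:9)",
    ]
    for (service, ts), n in sorted(count_timeouts(sample).items()):
        print(f"{service} {ts}: {n} timeout(s)")
```

If every service shows timeouts within the same second or two, the cause is more likely something the containers share (CPU, memory, disk) than a bug in one app.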

I tried to change proc_name = 'app' to different values but the same thing is happening. Here's a snippet of my docker compose file where I run the command for Gunicorn:

  m:
    build:
      dockerfile: ./m/Dockerfile
      context: ../../
    command: gunicorn --bind 0.0.0.0:82 --workers 3 ${M_FLASK_APP}:app
    environment:
      - FLASK_ENV=${FLASK_ENV}
      - PYTHONUNBUFFERED=1
    ports:
      - 82:82
    volumes:
      - ./m/src:/opt/m

  p:
    build:
      dockerfile: ./p/Dockerfile
      context: ../../
    command: gunicorn --bind 0.0.0.0:83 --workers 3 ${P_FLASK_APP}:app
    environment:
      - FLASK_ENV=${FLASK_ENV}
      - PYTHONUNBUFFERED=1
    ports:
      - 83:83
    volumes:
      - ./p/src:/opt/p

Can somebody point me in the right direction? Is this a Gunicorn configuration problem or a Docker problem (since I am running six apps and one Nginx server)?


Solution

  • I fixed this issue by upgrading my Compute Engine instance (I'm using Google Cloud). Previously I had 2 CPU cores with 4GB RAM; I changed it to a CPU-optimized machine with 4 CPU cores and 4GB RAM. Now I can host more than 6 apps on the same instance, but the cost is much higher. Apparently, the error above was caused by hardware limitations.
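To make the hardware limitation more concrete, the configured worker count can be compared with Gunicorn's documented starting point of (2 x CPU cores) + 1 workers. The sketch below is only a rough illustration: the function names are hypothetical, splitting that budget evenly across containers is an assumption and not something Gunicorn prescribes, and the figures (6 apps, 3 workers each, 2 or 4 cores) are taken from the question and answer above.

```python
def recommended_total_workers(cpu_cores):
    """Gunicorn's rule-of-thumb starting point: (2 x cores) + 1."""
    return 2 * cpu_cores + 1


def workers_per_app(cpu_cores, n_apps):
    """Assumption: share the host-wide budget evenly, at least 1 per app."""
    return max(1, recommended_total_workers(cpu_cores) // n_apps)


if __name__ == "__main__":
    n_apps = 6              # six Gunicorn containers on one host
    configured_per_app = 3  # from "--workers 3" in the compose file

    for cores in (2, 4):
        budget = recommended_total_workers(cores)
        configured = n_apps * configured_per_app
        print(
            f"{cores} cores: rule-of-thumb budget {budget} workers, "
            f"configured {configured}, "
            f"even share per app {workers_per_app(cores, n_apps)}"
        )
```

By this rule of thumb, 18 workers on a 2-core machine are well over the suggested starting point, which is consistent with the workers timing out under CPU starvation. Lowering --workers per container might be a cheaper alternative to a larger instance, though that was not tested here.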