uWSGI restarting workers when max-worker-lifetime is reached causes downtime

I have uWSGI running Django. I configured my uWSGI workers to have max-worker-lifetime=3600.

What I see is that every 3600 seconds, both uWSGI workers terminate together. This causes about 10-15 seconds of downtime, during which all requests to the server fail with a 502 from the remote server.

These are the logs:

Everything normal

00:00:00 worker 1 lifetime reached, it was running for 3601 second(s)
00:00:00 worker 2 lifetime reached, it was running for 3601 second(s)

00:00:01 HTTP 502 (on the remote server)
00:00:01 HTTP 502 (on the remote server)
...
00:00:11 HTTP 502 (on the remote server)
00:00:11 HTTP 502 (on the remote server)

00:00:11 Respawned uWSGI worker 1 (new pid: 66)
00:00:11 Respawned uWSGI worker 2 (new pid: 67)

Everything goes back to normal

I'm not sure why it behaves this way.

This is the configuration used (only relevant parts):

    UWSGI_MASTER=true \
    UWSGI_WORKERS=2 \
    UWSGI_THREADS=8 \
    UWSGI_LISTEN=250 \
    UWSGI_LAZY_APPS=true \
    UWSGI_WSGI_ENV_BEHAVIOR=holy \
    UWSGI_MAX_WORKER_LIFETIME=3600 \
    UWSGI_RELOAD_ON_RSS=1024 \
    UWSGI_SINGLE_INTERPRETER=true \
    UWSGI_VACUUM=true

Solution

You should use max-worker-lifetime-delta, a parameter that was recently added through a community contribution:

https://github.com/unbit/uwsgi/issues/2020

https://github.com/unbit/uwsgi/pull/2021

This will make sure your workers don't restart so close together. Also, since your workers take a long time to load, run enough workers that a single worker restart doesn't have a large impact.
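
As a rough sketch, assuming the same environment-variable style as the configuration in the question (uWSGI maps UWSGI_-prefixed variables to options), the delta could be set as shown below. The value 300 is only an illustrative assumption, not a recommendation. As far as I understand the option, each worker's lifetime limit is offset by a multiple of the delta based on its worker id, so consecutive workers reach their limits roughly that many seconds apart. Since a respawn takes about 11 seconds in the logs above, any delta comfortably larger than that should keep at least one worker serving requests.

    UWSGI_MAX_WORKER_LIFETIME=3600 \
    UWSGI_MAX_WORKER_LIFETIME_DELTA=300

After changing this, the "lifetime reached" log lines for worker 1 and worker 2 should no longer share the same timestamp.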

Also, the worker timeout you are using looks quite low. You should experiment with that too and increase it to a value that works better for you.