I have uWSGI running Django. I configured my uWSGI workers to have max-worker-lifetime=3600.
What I see is that every 3600 seconds, both uWSGI workers terminate together. This causes about 10-15 seconds of downtime, during which all requests to the server fail with a 502 from the remote server.
These are the logs:
Everything normal
00:00:00 worker 1 lifetime reached, it was running for 3601 second(s)
00:00:00 worker 2 lifetime reached, it was running for 3601 second(s)
00:00:01 HTTP 502 (on the remote server)
00:00:01 HTTP 502 (on the remote server)
...
00:00:11 HTTP 502 (on the remote server)
00:00:11 HTTP 502 (on the remote server)
00:00:11 Respawned uWSGI worker 1 (new pid: 66)
00:00:11 Respawned uWSGI worker 2 (new pid: 67)
Everything goes back to normal
I'm not sure why both workers restart at the same moment instead of one at a time.
This is the configuration used (only relevant parts):
UWSGI_MASTER=true \
UWSGI_WORKERS=2 \
UWSGI_THREADS=8 \
UWSGI_LISTEN=250 \
UWSGI_LAZY_APPS=true \
UWSGI_WSGI_ENV_BEHAVIOR=holy \
UWSGI_MAX_WORKER_LIFETIME=3600 \
UWSGI_RELOAD_ON_RSS=1024 \
UWSGI_SINGLE_INTERPRETER=true \
UWSGI_VACUUM=true
You should use the max-worker-lifetime-delta parameter, which was recently added through a community contribution:
https://github.com/unbit/uwsgi/issues/2020
https://github.com/unbit/uwsgi/pull/2021
This staggers the restarts so that your workers don't all reach their lifetime limit at the same moment. Also, since your workers take a long time to load, run enough workers that a single worker restarting doesn't have a large impact.
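As a rough sketch, the option can be set the same way as the other settings in the question, assuming your uWSGI build already includes the pull request linked above. The value 60 is only an illustrative assumption, chosen to be larger than the roughly 11 seconds a respawn takes in the logs; as I understand the pull request, each worker's lifetime is extended by a multiple of the delta based on its worker id, so the two workers would hit their limits about a minute apart.

```shell
# Sketch only: requires a uWSGI version that contains max-worker-lifetime-delta.
# 60 is an assumed value; pick something longer than your worker startup time.
UWSGI_MAX_WORKER_LIFETIME=3600 \
UWSGI_MAX_WORKER_LIFETIME_DELTA=60
```

After deploying, the "lifetime reached" log lines for worker 1 and worker 2 should no longer carry the same timestamp; if they still do, the running uWSGI version probably does not support the option.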
Also, the worker timeout you are using looks quite low. Experiment with it and raise it to a value that works better for your setup.