I have a very long script that ingests a PDF, does a lot of processing, then returns a result. It runs perfectly on port 8000 when started with either
python manage.py runserver 0.0.0.0:8000
or
gunicorn --bind 0.0.0.0:8000 myproject.wsgi
However, when I run it on port 80 in "production", the script stops at a certain point with no errors and seemingly no holes in the logic. What's really confusing is that it stops in different places depending on the length/complexity of the processed document. Short/simple ones complete with no issue, but a longer one will stop in the middle.
I tried adding a very detailed log file to debug the issue. If I process one document, it stops running in the same loop but at different places within the loop (seemingly random), indicating that this isn't a logical flaw (note I'm writing and flushing). Furthermore, if I use a longer/more complex document it mysteriously stops earlier in the process.
I'm deploying this using Django via gunicorn/nginx on DigitalOcean.
Is there some sort of built-in protection in any of the above that stops a process after a certain number of CPU cycles or amount of time, as a guard against infinite loops? That's the only thing I can think of; I'm otherwise out of ideas.
I'd really appreciate any help!
Figured it out. Gunicorn has a built-in timeout: it kills and restarts any worker that stays silent for longer than a set time. The default (30 seconds, per gunicorn's documentation) was too short for my process. To fix it, add the "--timeout" option to "ExecStart" in the gunicorn systemd service file; with a standard setup on Ubuntu 20.04:
sudo nano /etc/systemd/system/gunicorn.service
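then add the --timeout option to ExecStart as shown here (I used 120 seconds in this example). After saving, run "sudo systemctl daemon-reload" and "sudo systemctl restart gunicorn" so the change takes effect: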
ExecStart=/home/sammy/myprojectdir/myprojectenv/bin/gunicorn \
--access-logfile - \
--workers 3 \
--timeout 120 \
--bind unix:/run/gunicorn.sock \
myproject.wsgi:application
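As an alternative to command-line flags, the same settings can live in a Gunicorn configuration file, which is plain Python. Below is a minimal sketch that mirrors the ExecStart values above. The file name gunicorn.conf.py and the choice to move the settings there are assumptions, not part of my original setup; ExecStart would then need to point at the file with gunicorn's -c/--config option.

```python
# gunicorn.conf.py -- hypothetical config file mirroring the ExecStart flags above.
# The values are the ones from this answer; tune the timeout to your workload.

# Socket the workers listen on (same as --bind).
bind = "unix:/run/gunicorn.sock"

# Number of worker processes (same as --workers).
workers = 3

# Seconds a worker may stay silent before it is killed and restarted
# (same as --timeout; gunicorn's default is 30).
timeout = 120

# "-" sends the access log to stdout (same as --access-logfile -).
accesslog = "-"
```

Keeping the timeout in a file makes it easier to see which value is in force without reading the systemd unit.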
I determined this by looking at the systemd journal with "journalctl", which captures gunicorn's stdout when it runs as a service. To view the most recent 50 lines, enter the following into your terminal:
journalctl | tail -50
In my case, I noticed an entry containing "[CRITICAL] WORKER TIMEOUT (pid:xxxxxx)"
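If the journal is long, a small filter can pull out just the timeout entries. This is a rough sketch rather than anything from my actual setup: the function name find_worker_timeouts is hypothetical, and the sample lines are made up to resemble gunicorn's log format.

```python
import re

# Matches the message logged when gunicorn kills a worker that exceeded the timeout.
TIMEOUT_RE = re.compile(r"\[CRITICAL\] WORKER TIMEOUT \(pid:(\d+)\)")


def find_worker_timeouts(lines):
    """Return the worker PIDs from every WORKER TIMEOUT entry in `lines`."""
    pids = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            pids.append(int(match.group(1)))
    return pids


# Made-up sample lines, for illustration only.
sample = [
    "[2021-01-01 12:00:00 +0000] [1001] [INFO] Booting worker with pid: 1002",
    "[2021-01-01 12:00:31 +0000] [1001] [CRITICAL] WORKER TIMEOUT (pid:1002)",
]

print(find_worker_timeouts(sample))  # prints [1002]
```

On a real server the lines would come from the journalctl output instead of the sample list, for example by reading them from standard input.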