I am running a Python Flask app behind Gunicorn on an Ubuntu VM. The Ubuntu VM is hosted in Azure, and I am using a cloud-init script to install the app and launch Gunicorn upon VM instantiation.
Gunicorn launches with 8 workers (recommended for a VM with 4 vCPUs). However, immediately after VM initialization, my VM throughput is limited to about 100 requests per second.
If I kill the 8 Gunicorn workers that were launched by cloud-init and manually start Gunicorn myself as superuser (again 8 workers), then throughput jumps up to about 900 requests per second.
I am not able to tell any difference between Gunicorn processes launched by cloud-init and Gunicorn processes launched by superuser, except that they show different behavior under load.
Here is a screenshot of top when the VM is freshly initialized and under stress:
Here is a screenshot of top after I have killed the Gunicorn workers and restarted them as superuser:
You can see that for the cloud-init-spawned workers, only a few processes appear to be getting any load, while load is evenly distributed across the superuser workers.
Below I will compare the output of ps for the cloud-init and superuser workers.
cloud-init:
superuser:
The output from ps shows that the cloud-init workers are indeed distributed across all 4 vCPUs. I am wondering, then, why they behave as if only a few of them are getting traffic.
Here is the content of my cloud-init.txt:
#cloud-config
package_upgrade: true
package_update: true
packages:
- python3-pip
runcmd:
- sudo -H pip3 install -U pipenv
- cd /home/azureuser
- git clone https://github.com/[user]/[repo].git
- cd /home/azureuser/serve-stateful
- pipenv install
- pipenv run gunicorn -w 8 --bind "$(hostname -I):8034" gunicorn_server:app
I fixed the problem by adding the --daemon, --user, and --group flags to my Gunicorn launch command:
pipenv run gunicorn -w 8 --bind "$(hostname -I):8034" gunicorn_server:app --user root --group root --daemon
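The same settings can also be kept in a Gunicorn config file, which is plain Python. This is only a sketch: the file name gunicorn.conf.py is my choice, and binding to 0.0.0.0 is an assumption that stands in for the "$(hostname -I)" substitution in the command above.

```python
# gunicorn.conf.py (hypothetical file name)
# Mirrors the command-line flags used above as Gunicorn settings.

workers = 8            # same as -w 8
bind = "0.0.0.0:8034"  # assumption: all interfaces instead of $(hostname -I)
daemon = True          # same as --daemon: detach from the launching process
user = "root"          # same as --user root
group = "root"         # same as --group root
```

With that file in the project directory, the launch line in cloud-init would become: pipenv run gunicorn -c gunicorn.conf.py gunicorn_server:app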
I am not 100% sure of the details, but I think the issue is that Gunicorn workers inherit the stdio of the process that launched them when not started in daemon mode (see http://docs.gunicorn.org/en/stable/settings.html#enable-stdio-inheritance). All that writing to stdio could be throttling performance.
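One way to check this guess is to look at where each worker's stdin, stdout, and stderr actually point. Here is a rough sketch that reads the Linux /proc filesystem; the function name stdio_targets is made up for illustration, and it will most likely need to run as root to inspect processes owned by other users.

```python
import os
import sys


def stdio_targets(pid):
    """Return {fd: target} for file descriptors 0, 1 and 2 of a process.

    Relies on Linux's /proc/<pid>/fd symlinks. A value of None means the
    link could not be read (no /proc, process gone, or permission denied).
    """
    targets = {}
    for fd in (0, 1, 2):
        try:
            targets[fd] = os.readlink(f"/proc/{pid}/fd/{fd}")
        except OSError:
            targets[fd] = None
    return targets


if __name__ == "__main__":
    # Usage: python3 stdio_check.py <pid> [<pid> ...]
    for pid in sys.argv[1:]:
        print(pid, stdio_targets(pid))
```

Running it against a worker started by cloud-init and one started with --daemon should show whether the descriptors differ, for example a pipe or console in one case and /dev/null in the other.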
Interesting (and exciting!) to note: throughput has now gone up further, to 1368 requests per second, since even the superuser workers I was spawning in my question above were writing to stdio.