I have a Flask app that I have been unable to scale past 125 RPS locally. It is a simple 'hello world' as seen below.
I'm using the Locust.io load testing tool. I have pointed the same load test to a local Golang hello world, and am able to get into 1000's of RPS. IMHO this rules out my Locust and OS configurations as potential bottlenecks.
I'm using 17 workers as my machine has 8 cores ((2*CPU)+1 is recommended by the Gunicorn docs).
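For reference, the arithmetic behind the worker count can be sketched as follows. The function name `recommended_workers` is a hypothetical helper made up for illustration, not part of Gunicorn; it only applies the (2*CPU)+1 rule quoted above.

```python
import os


def recommended_workers(cpu_count: int) -> int:
    """Apply the (2 * CPU) + 1 rule of thumb quoted above."""
    return 2 * cpu_count + 1


if __name__ == "__main__":
    # os.cpu_count() can return None, so fall back to 1 core.
    cores = os.cpu_count() or 1
    print(f"{cores} cores -> {recommended_workers(cores)} workers")
```

With 8 cores this gives the 17 workers used in the question.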
From what I've read, using the gevent worker type for Gunicorn should allow me to reach 1000's of RPS, just like with Golang. Is this a correct assumption, or am I missing something critical?
Abbreviated code:
from flask import Flask

app = Flask(__name__)

@app.route('/')
def hello():
    return 'hello world!'
Gunicorn conf:
gunicorn -k gevent -w 17 --worker-connections 100000 app:app
Answer from authors here: https://github.com/benoitc/gunicorn/issues/305
After another week of debugging, I figured it out! Turns out there is an additional worker type, gevent_pywsgi. Using this worker type increased the throughput roughly 10x, to levels I would consider acceptable.
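As a rough sketch, the same setup can be written as a Gunicorn config file instead of command-line flags. The file name `gunicorn.conf.py` and the choice to compute the worker count at startup are assumptions for illustration; the worker type, the (2*CPU)+1 rule, and the connection limit are taken from the question above. It assumes gevent is installed.

```python
# gunicorn.conf.py (file name is an assumption; any path passed via -c works)
import multiprocessing

# Worker type named in the answer above.
worker_class = "gevent_pywsgi"

# (2 * CPU) + 1 rule from the question; yields 17 on an 8-core machine.
workers = multiprocessing.cpu_count() * 2 + 1

# Per-worker connection limit, copied from the original command line.
worker_connections = 100000
```

It would then be started with something like `gunicorn -c gunicorn.conf.py app:app`. Results will vary by machine, so it is worth re-running the load test rather than assuming the 10x figure carries over.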
My testing showed no difference in performance between the sync worker and the gevent worker, so I’m still not sure what’s going on there, or what the intent of the gevent worker type is.