Given the following minimal example:
```python
# myproject.py
import time

from flask import Flask

app = Flask(__name__)

@app.route('/')
def hello():
    time.sleep(5)
    return 'hello'

if __name__ == '__main__':
    app.run(host='0.0.0.0')
```
```python
# wsgi.py
from myproject import app

if __name__ == '__main__':
    app.run()
```
```shell
uwsgi --http-socket 0.0.0.0:8080 --workers 1 --listen 2 --module wsgi:app
```
I would now expect that sending more than 3 requests at the same time (1 being handled by the worker, 2 queued) would result in only 3 being served and the rest rejected.
However, this seems to not be the case. When sending 10 requests like so
```shell
curl http://127.0.0.1:8080 & \
curl http://127.0.0.1:8080 & \
curl http://127.0.0.1:8080 & \
curl http://127.0.0.1:8080 & \
curl http://127.0.0.1:8080 & \
curl http://127.0.0.1:8080 & \
curl http://127.0.0.1:8080 & \
curl http://127.0.0.1:8080 & \
curl http://127.0.0.1:8080 & \
curl http://127.0.0.1:8080
```
all are successfully served one-by-one. Why is this the case? Do I misunderstand/misconfigure something?
(I'm using Ubuntu 20.04, in case this is of any importance.)
I am not sure, as low-level networking is not my area of expertise, but I believe I have found the answer.
I found a question from a number of years ago very similar to yours: someone was seeing uwsgi queue more connections than the specified listen value should allow.
https://uwsgi.unbit.narkive.com/QKdRyejv/when-the-backlog-is-full-is-uwsgi-supposed-to-accept-connections
Near the bottom of the page we see this:
Yes, this is the expected behaviour for linux, the minimal size of the listen queue is always forced to 8 (in BSD to 5).
To confirm this is actually correct, I did some more digging, which led me to find that the listen value is only a hint for how long the queue should be, and implementations may impose a different limit.
https://pubs.opengroup.org/onlinepubs/9699919799/functions/listen.html
The backlog argument provides a hint to the implementation which the implementation shall use to limit the number of outstanding connections in the socket's listen queue. Implementations may impose a limit on backlog and silently reduce the specified value. Normally, a larger backlog argument value shall result in a larger or equal length of the listen queue. Implementations shall support values of backlog up to SOMAXCONN, defined in <sys/socket.h>.
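As a quick sketch of this "hint" semantics from Python: socket.listen() accepts any backlog value and the kernel may silently adjust it, and the SOMAXCONN cap mentioned in the spec is exposed as socket.SOMAXCONN.

```python
import socket

print(socket.SOMAXCONN)  # the platform's cap on the backlog hint

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(("127.0.0.1", 0))
s.listen(2)  # only a hint: the kernel may silently raise or lower it
s.close()
```

Note that nothing here reports back what the kernel actually used; the call succeeds regardless.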
The question "listen() ignores the backlog argument?" led me to check the actual Linux kernel source code to confirm the original claim.
At http://lxr.linux.no/#linux+v2.6.36/net/core/request_sock.c#L44 we see what seems to be a confirmation:
```c
43 nr_table_entries = min_t(u32, nr_table_entries, sysctl_max_syn_backlog);
44 nr_table_entries = max_t(u32, nr_table_entries, 8);
45 nr_table_entries = roundup_pow_of_two(nr_table_entries + 1);
```
This first takes the smaller of nr_table_entries and sysctl_max_syn_backlog (which defaults to 256), so in our example nr_table_entries should be 2. Next it takes the maximum of that and 8, so our 2 gets thrown away and 8 is used. Then it adds 1 and rounds up to the next power of two, giving 16 table entries.
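That clamping arithmetic can be sketched in a few lines of Python (assuming the default sysctl_max_syn_backlog of 256; on a real system it is tunable):

```python
def roundup_pow_of_two(x):
    """Smallest power of two >= x, like the kernel macro."""
    return 1 << (x - 1).bit_length()

def nr_table_entries(backlog, max_syn_backlog=256):
    # Mirrors the lines from request_sock.c: clamp the requested
    # backlog to [8, max_syn_backlog], then round (n + 1) up to a
    # power of two.
    n = min(backlog, max_syn_backlog)
    n = max(n, 8)
    return roundup_pow_of_two(n + 1)

print(nr_table_entries(2))  # --listen 2 ends up as 16 table entries
```

So any requested backlog from 1 through 7 ends up with the same 16-entry table as 8 would.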
I flooded your example server with more traffic (100 concurrent requests), and all but 9 failed: that is 1 request being handled by the worker plus 8 sitting in the forced-minimum queue. I'm fairly convinced this explains the behavior you're seeing. You can't actually have a listen value that low.
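For reference, a flood like that can also be scripted from Python. This sketch is self-contained: it uses a local threaded stand-in server (so every request succeeds here); to reproduce the real test, point url at the uwsgi instance on http://127.0.0.1:8080 instead and count how many of the requests come back with status 200.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.request import urlopen

class SleepyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        time.sleep(0.2)  # stand-in for the 5-second sleep in hello()
        self.send_response(200)
        self.send_header("Content-Length", "5")
        self.end_headers()
        self.wfile.write(b"hello")

    def log_message(self, *args):  # keep the demo quiet
        pass

# Local stand-in server; replace url with the uwsgi address for the real test.
server = ThreadingHTTPServer(("127.0.0.1", 0), SleepyHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/" % server.server_address[1]

def fetch(_):
    try:
        with urlopen(url, timeout=10) as resp:
            return resp.status
    except OSError:
        return None  # rejected or timed-out request

with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(fetch, range(10)))

print(results.count(200))  # all 10 succeed against this threaded stand-in
server.shutdown()
server.server_close()
```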