Given the following minimal example:
```python
# myproject.py
import time

from flask import Flask

app = Flask(__name__)

@app.route('/')
def hello():
    time.sleep(5)
    return 'hello'

if __name__ == '__main__':
    app.run(host='0.0.0.0')
```
```python
# wsgi.py
from myproject import app

if __name__ == '__main__':
    app.run()
```
```shell
uwsgi --http-socket 0.0.0.0:8080 --workers 1 --listen 2 --module wsgi:app
```
I would now expect that sending more than 3 requests at the same time (1 being handled by the worker, 2 queued) would result in only 3 being served and the rest rejected.
However, this seems to not be the case. When sending 10 requests like so
```shell
curl http://127.0.0.1:8080 & \
curl http://127.0.0.1:8080 & \
curl http://127.0.0.1:8080 & \
curl http://127.0.0.1:8080 & \
curl http://127.0.0.1:8080 & \
curl http://127.0.0.1:8080 & \
curl http://127.0.0.1:8080 & \
curl http://127.0.0.1:8080 & \
curl http://127.0.0.1:8080 & \
curl http://127.0.0.1:8080
```
all are successfully served one-by-one. Why is this the case? Do I misunderstand/misconfigure something?
(I'm using Ubuntu 20.04, in case this is of any importance.)
I am not sure, as low-level networking is not my area of expertise, but I believe I have found the answer.
I found a question from a number of years ago very similar to yours: someone was seeing uwsgi queue more connections than the specified listen value should allow.
https://uwsgi.unbit.narkive.com/QKdRyejv/when-the-backlog-is-full-is-uwsgi-supposed-to-accept-connections
Near the bottom of the page we see this:
Yes, this is the expected behaviour for linux, the minimal size of the listen queue is always forced to 8 (in BSD to 5).
To confirm this is actually correct, I did some more digging, which led me to find that the listen value is only a hint for how long the queue should be, and implementations may impose a different limit.
https://pubs.opengroup.org/onlinepubs/9699919799/functions/listen.html
The backlog argument provides a hint to the implementation which the implementation shall use to limit the number of outstanding connections in the socket's listen queue. Implementations may impose a limit on backlog and silently reduce the specified value. Normally, a larger backlog argument value shall result in a larger or equal length of the listen queue. Implementations shall support values of backlog up to SOMAXCONN, defined in <sys/socket.h>.
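As a quick sketch of this "hint" semantics from Python: socket.listen() accepts any backlog value and the kernel may silently adjust it, and the SOMAXCONN cap mentioned in the spec is exposed as socket.SOMAXCONN.

```python
import socket

print(socket.SOMAXCONN)  # the platform's cap on the backlog hint

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(("127.0.0.1", 0))
s.listen(2)  # only a hint: the kernel may silently raise or lower it
s.close()
```

Note that nothing here reports back what the kernel actually used; the call succeeds regardless.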
The question "listen() ignores the backlog argument?" led me to check the actual Linux kernel source code to confirm the original claim.
At http://lxr.linux.no/#linux+v2.6.36/net/core/request_sock.c#L44 we see what seems to be a confirmation:
```c
43 nr_table_entries = min_t(u32, nr_table_entries, sysctl_max_syn_backlog);
44 nr_table_entries = max_t(u32, nr_table_entries, 8);
45 nr_table_entries = roundup_pow_of_two(nr_table_entries + 1);
```
This first takes the smaller of nr_table_entries and sysctl_max_syn_backlog (which defaults to 256), so in our example nr_table_entries should be 2. Next it takes the maximum of that and 8, so our 2 gets thrown away and 8 is used. Then it adds 1 and rounds up to the next power of two, giving 16 table entries.
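That clamping arithmetic can be sketched in a few lines of Python (assuming the default sysctl_max_syn_backlog of 256; on a real system it is tunable):

```python
def roundup_pow_of_two(x):
    """Smallest power of two >= x, like the kernel macro."""
    return 1 << (x - 1).bit_length()

def nr_table_entries(backlog, max_syn_backlog=256):
    # Mirrors the lines from request_sock.c: clamp the requested
    # backlog to [8, max_syn_backlog], then round (n + 1) up to a
    # power of two.
    n = min(backlog, max_syn_backlog)
    n = max(n, 8)
    return roundup_pow_of_two(n + 1)

print(nr_table_entries(2))  # --listen 2 ends up as 16 table entries
```

So any requested backlog from 1 through 7 ends up with the same 16-entry table as 8 would.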
I flooded your example server with more traffic (100 concurrent requests), and all but 9 failed: that is 1 request being handled by the worker plus 8 sitting in the forced-minimum queue. I'm fairly convinced this explains the behavior you're seeing. You can't actually have a listen value that low.
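For reference, a flood like that can also be scripted from Python. This sketch is self-contained: it uses a local threaded stand-in server (so every request succeeds here); to reproduce the real test, point url at the uwsgi instance on http://127.0.0.1:8080 instead and count how many of the requests come back with status 200.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.request import urlopen

class SleepyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        time.sleep(0.2)  # stand-in for the 5-second sleep in hello()
        self.send_response(200)
        self.send_header("Content-Length", "5")
        self.end_headers()
        self.wfile.write(b"hello")

    def log_message(self, *args):  # keep the demo quiet
        pass

# Local stand-in server; replace url with the uwsgi address for the real test.
server = ThreadingHTTPServer(("127.0.0.1", 0), SleepyHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/" % server.server_address[1]

def fetch(_):
    try:
        with urlopen(url, timeout=10) as resp:
            return resp.status
    except OSError:
        return None  # rejected or timed-out request

with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(fetch, range(10)))

print(results.count(200))  # all 10 succeed against this threaded stand-in
server.shutdown()
server.server_close()
```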