linux multithreading sockets scalability

Does the Thundering Herd Problem exist on Linux anymore?

Many Linux/Unix programming books and tutorials speak about the "Thundering Herd Problem" which happens when multiple threads or forks are blocked on a select() call waiting for readability of a listening socket. When the connection comes in, all threads and forks are woken up but only one "wins" with a successful call to accept(). In the meantime, a lot of CPU time is wasted waking up all the threads/forks for no reason.
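To make the scenario concrete, here is a minimal sketch of the "select() then accept()" pattern. It uses threads rather than forks to stay short, and the function name `run_herd` and its parameters are made up for illustration. It counts how many workers were woken by select() and how many actually accepted the single incoming connection. How many workers wake up depends on the kernel and on timing, so the woken count is only an observation, not a guarantee.

```python
import select
import socket
import threading


def run_herd(num_workers=4, timeout=1.0):
    """Send one connection to many select()-then-accept() workers.

    Returns (woken, accepted). Hypothetical helper for illustration only.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(16)
    # Non-blocking, so workers that lose the race get an error
    # from accept() instead of blocking forever.
    listener.setblocking(False)

    stats_lock = threading.Lock()
    counts = {"woken": 0, "accepted": 0}
    ready = threading.Barrier(num_workers + 1)

    def worker():
        ready.wait()
        readable, _, _ = select.select([listener], [], [], timeout)
        if not readable:
            return  # timed out without seeing the connection
        with stats_lock:
            counts["woken"] += 1
        try:
            conn, _ = listener.accept()
        except BlockingIOError:
            return  # another worker won the race: a wasted wakeup
        with stats_lock:
            counts["accepted"] += 1
        conn.close()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    ready.wait()  # all workers are running and about to call select()
    client = socket.create_connection(listener.getsockname(), timeout=timeout)
    for t in threads:
        t.join()
    client.close()
    listener.close()
    return counts["woken"], counts["accepted"]


if __name__ == "__main__":
    woken, accepted = run_herd()
    print(f"woken={woken} accepted={accepted}")
```

Any gap between the two numbers is the wasted work: workers that were woken for the connection but lost the race to accept() it.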

I noticed a project which provides a "fix" for this problem in the Linux kernel, but it is a very old patch.

I think there are two variants: one where each fork does select() and then accept(), and one where each fork just does accept().

Do modern Unix/Linux kernels still have the Thundering Herd Problem in both these cases or only the "select() then accept()" version?

Solution

It's there and it's real. See this issue that we are seeing in uwsgi: https://github.com/unbit/uwsgi/issues/2611

If I disable the --thunder-lock option in uwsgi, uwsgi no longer uses the system's proper API/locking mechanism. In that case, during peak load (currently about 1 lakh, i.e. 100,000, requests per minute on my server), I see a lot of context switches, a lot of wasted time, and consistently high response times from my application.
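The general idea behind this kind of option is to serialize accept() so that only one worker waits on the listening socket at a time. The sketch below is not uwsgi's actual implementation: it assumes threads and a plain `threading.Lock` in place of an inter-process lock, and `run_serialized` and its parameters are hypothetical names. Because the lock is held across both select() and accept(), every wakeup in this sketch leads to a successful accept().

```python
import select
import socket
import threading


def run_serialized(num_workers=4, num_conns=3, timeout=1.0):
    """Workers take a lock before select()/accept().

    Returns (woken, accepted). Hypothetical helper for illustration only.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(16)
    listener.setblocking(False)

    accept_lock = threading.Lock()  # stand-in for an inter-process lock
    stats_lock = threading.Lock()
    stop = threading.Event()
    counts = {"woken": 0, "accepted": 0}

    def worker():
        while True:
            with accept_lock:
                if stop.is_set():
                    return
                # Only the lock holder waits on the listening socket.
                readable, _, _ = select.select([listener], [], [], timeout)
                if not readable:
                    stop.set()  # idle for `timeout` seconds: end the demo
                    return
                with stats_lock:
                    counts["woken"] += 1
                try:
                    conn, _ = listener.accept()
                except BlockingIOError:
                    continue  # defensive; not expected while holding the lock
            # The lock is released here, so another worker can start
            # waiting while this one handles the connection.
            with stats_lock:
                counts["accepted"] += 1
            conn.close()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    addr = listener.getsockname()
    clients = [socket.create_connection(addr, timeout=timeout)
               for _ in range(num_conns)]
    for t in threads:
        t.join()
    for c in clients:
        c.close()
    listener.close()
    return counts["woken"], counts["accepted"]


if __name__ == "__main__":
    woken, accepted = run_serialized()
    print(f"woken={woken} accepted={accepted}")
```

The trade-off is that workers now contend on the lock instead of on the socket, so the lock itself has to be cheap for this to pay off under load.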