linux multithreading sockets scalability

Does the Thundering Herd Problem exist on Linux anymore?


Many Linux/Unix programming books and tutorials describe the "Thundering Herd Problem", which happens when multiple threads or forked processes are blocked in a select() call waiting for a listening socket to become readable. When a connection comes in, all of the threads and processes are woken up, but only one "wins" with a successful call to accept(). In the meantime, a lot of CPU time is wasted waking up the others for no reason.

I noticed a project which provides a "fix" for this problem in the Linux kernel, but it is a very old patch.

I think there are two variants: one where each forked process calls select() and then accept(), and one where each process just calls accept().

Do modern Unix/Linux kernels still have the Thundering Herd Problem in both these cases or only the "select() then accept()" version?
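For reference, the "select() then accept()" variant can be sketched roughly as follows. This is only a minimal illustration, and it makes some assumptions: it uses threads instead of forked processes to keep it short, and the function name `run_herd_demo`, the worker count, and the timeout are all made up for this example. It counts how many workers were woken by select() and how many actually got the connection.

```python
import select
import socket
import threading


def run_herd_demo(num_workers=4, timeout=1.0):
    """Let num_workers threads select() on one listening socket, make a
    single connection, and return (wakeups, accepts)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(16)
    # Non-blocking, so workers that lose the race get an error from
    # accept() instead of blocking forever.
    listener.setblocking(False)

    counts = {"wakeups": 0, "accepts": 0}
    counts_lock = threading.Lock()
    start = threading.Barrier(num_workers + 1)

    def worker():
        start.wait()
        readable, _, _ = select.select([listener], [], [], timeout)
        if not readable:
            return  # timed out: this worker was never woken for the connection
        with counts_lock:
            counts["wakeups"] += 1
        try:
            conn, _ = listener.accept()
        except BlockingIOError:
            return  # woken up, but another worker already took the connection
        with counts_lock:
            counts["accepts"] += 1
        conn.close()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    start.wait()  # release all workers at roughly the same time
    client = socket.create_connection(listener.getsockname())
    for t in threads:
        t.join()
    client.close()
    listener.close()
    return counts["wakeups"], counts["accepts"]
```

Calling `run_herd_demo()` returns a pair such as (wakeups, accepts). Only one worker can accept the single connection; how many workers are woken depends on timing and on the kernel, so any wakeups beyond the first are the wasted work the question is about.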


Solution

  • It's still there, and it's real. See this issue that we are running into with uWSGI: https://github.com/unbit/uwsgi/issues/2611

    If I disable the --thunder-lock option, uWSGI does not use the system's locking mechanism around accept(). In that case, during peak load (currently about 1 lakh, i.e. 100,000, requests per minute on my server), I see a lot of context switches, a lot of wasted CPU time, and consistently high response times from my application.
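The general idea of serializing accept() with a lock can be sketched like this. To be clear, this is not uWSGI's implementation: it is a small illustration that assumes threads and a `threading.Lock` as a stand-in for an inter-process lock, and the name `run_serialized_demo` and its parameters are invented for the example. Because only the lock holder waits on the socket, every wakeup should lead to a successful accept().

```python
import select
import socket
import threading


def run_serialized_demo(num_workers=4, num_connections=3, timeout=1.0):
    """Workers take a lock before select()/accept(), so only one of them
    waits on the listening socket at a time. Returns (wakeups, accepts)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(16)
    listener.setblocking(False)

    accept_lock = threading.Lock()  # stand-in for an inter-process lock
    counts = {"wakeups": 0, "accepts": 0}  # only touched while holding accept_lock

    def worker():
        while True:
            with accept_lock:
                if counts["accepts"] >= num_connections:
                    return  # all expected connections have been handled
                readable, _, _ = select.select([listener], [], [], timeout)
                if not readable:
                    return  # nothing arrived in time; stop this worker
                counts["wakeups"] += 1
                conn, _ = listener.accept()
                counts["accepts"] += 1
            # The lock is released before the connection is handled, so the
            # next worker can start waiting right away.
            conn.close()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    clients = [socket.create_connection(listener.getsockname())
               for _ in range(num_connections)]
    for t in threads:
        t.join()
    for c in clients:
        c.close()
    listener.close()
    return counts["wakeups"], counts["accepts"]
```

The trade-off in this sketch is that workers now contend on the lock instead of on the socket, but a worker that is woken always has a connection to handle.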