multithreaded epoll server: wake up N threads sleeping on the same epoll fd


I have a multithreaded epoll server. I create one epoll fd, and then X threads sleep in epoll_wait(), waiting for any event on that SAME epoll fd.

Now my question is this: how can I wake up exactly N threads, with 1 < N < X?

Until now, I've used the Linux-specific eventfd facility, and it worked pretty well with only 1 thread, but now, with multiple threads waiting on the SAME epoll fd, a problem arises:

case 1) LT: If I add my eventfd in "level-triggered" mode, ALL threads wake up when I write to the eventfd. This is just how level-triggered mode works: as long as the fd stays readable, every waiting thread is woken.

N = X

case 2) ET: If I add my eventfd in "edge-triggered" mode, ONLY 1 thread wakes up when I write to the eventfd. This is just how edge-triggered mode works: no more events are reported for the fd until I receive EAGAIN from read(eventfd, ...).

N = 1

case 3) I've also tried the self-pipe trick, expecting that writing N times to the pipe would wake up N threads. It doesn't work: it's not reliable. Sometimes one thread reads 2 "tokens" from the pipe, sometimes 1, or 3.

N = RANDOM

In all the cases I've tried, I can't wake up exactly N threads: I get either 1, ALL, or a RANDOM number. What am I missing? Any thoughts? NOTE: I've also tried the eventfd-specific EFD_SEMAPHORE flag, but it didn't help.


Solution

  • According to the eventfd(2) manual page:


    The file descriptor is readable (the select(2) readfds argument; the poll(2) POLLIN flag) if the counter has a value greater than 0.

    And, for an eventfd created with the EFD_SEMAPHORE flag:

    (if) the eventfd counter has a nonzero value, then a read(2) returns 8 bytes containing the value 1, and the counter's value is decremented by 1.


    Use a semaphore-mode (EFD_SEMAPHORE flag), non-blocking (EFD_NONBLOCK flag) eventfd, and wait on it with a level-triggered epoll_wait() or a normal poll().

    With eventfd_write(fd, N) you add N to the counter, where N is the number of threads you want to wake up.

    When a thread wakes up, it performs a read(). If the read fails with EAGAIN, the thread can go back to sleep: N successful reads have already been done, so N threads already know they have to stay awake.

    Disadvantages

    All X threads wake up briefly (the thundering herd problem), even though only N of them stay awake.
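
The approach in this answer can be sketched roughly as follows. This is a minimal illustration, not a production server: the names `struct pool`, `worker`, `wake_exactly`, and `MAX_THREADS` are made up for the example, and so is the second eventfd (`stop_fd`), which exists only so the demo can shut its threads down cleanly. A worker that obtains a token simply exits here, where a real worker would do its job before returning to epoll_wait().

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <time.h>
#include <unistd.h>

#define MAX_THREADS 64          /* arbitrary limit for this sketch */

struct pool {
    int epfd;                   /* epoll fd shared by all workers */
    int wake_fd;                /* semaphore-mode eventfd: one token per wake-up */
    int stop_fd;                /* demo only: tells every worker to exit */
    atomic_int woken;           /* workers that obtained a token */
};

static void *worker(void *arg)
{
    struct pool *p = arg;
    for (;;) {
        struct epoll_event ev;
        int n = epoll_wait(p->epfd, &ev, 1, -1);
        if (n < 0 && errno == EINTR)
            continue;
        if (n < 0 || ev.data.fd == p->stop_fd)
            return NULL;
        eventfd_t token;
        if (eventfd_read(p->wake_fd, &token) == 0) {
            /* Got one of the N tokens: this thread stays awake. */
            atomic_fetch_add(&p->woken, 1);
            return NULL;        /* a real worker would do its work here */
        }
        /* EAGAIN: all tokens are already taken, go back to sleep. */
    }
}

/* Starts x_threads workers, wakes n_wake of them, and returns how many
 * workers actually obtained a token (or -1 on setup failure). */
static int wake_exactly(int x_threads, int n_wake)
{
    assert(n_wake >= 1 && n_wake <= x_threads && x_threads <= MAX_THREADS);

    struct pool p = {
        .epfd = epoll_create1(0),
        .wake_fd = eventfd(0, EFD_SEMAPHORE | EFD_NONBLOCK),
        .stop_fd = eventfd(0, EFD_NONBLOCK),
        .woken = 0,
    };
    if (p.epfd < 0 || p.wake_fd < 0 || p.stop_fd < 0)
        return -1;

    /* Level-triggered registration (no EPOLLET) for both fds. */
    struct epoll_event ev = { .events = EPOLLIN };
    ev.data.fd = p.wake_fd;
    if (epoll_ctl(p.epfd, EPOLL_CTL_ADD, p.wake_fd, &ev) < 0)
        return -1;
    ev.data.fd = p.stop_fd;
    if (epoll_ctl(p.epfd, EPOLL_CTL_ADD, p.stop_fd, &ev) < 0)
        return -1;

    pthread_t tid[MAX_THREADS];
    int started = 0;
    while (started < x_threads &&
           pthread_create(&tid[started], NULL, worker, &p) == 0)
        started++;

    if (started == x_threads) {
        /* Add N tokens to the counter; each successful read takes one. */
        eventfd_write(p.wake_fd, (eventfd_t)n_wake);
        while (atomic_load(&p.woken) < n_wake)
            nanosleep(&(struct timespec){ .tv_sec = 0, .tv_nsec = 1000000 }, NULL);
    }

    /* Demo shutdown: stop_fd is never read, so it stays readable
     * and every remaining worker sees it and exits. */
    eventfd_write(p.stop_fd, 1);
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    int result = (started == x_threads) ? atomic_load(&p.woken) : -1;
    close(p.wake_fd);
    close(p.stop_fd);
    close(p.epfd);
    return result;
}
```

Built with something like `cc -pthread`, a call such as `wake_exactly(8, 3)` should return 3: all eight workers may be woken by the level-triggered event, but only three reads succeed and the other five get EAGAIN and go back to sleep.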