Tags: linux, epoll, unix-socket

Edge-triggered epoll for a Unix domain socket


I hit a strange issue: epoll_wait blocks while waiting for an EPOLLOUT event on a Unix domain socket in edge-triggered mode.

Some details: I use Boost.Asio for IPC between two processes, with file descriptor passing.

Here are some strace logs:

25097 16:59:04.273555 epoll_ctl(4, EPOLL_CTL_MOD, 37, {EPOLLIN|EPOLLPRI|EPOLLOUT|EPOLLERR|EPOLLHUP|EPOLLET, {u32=40872176, u64=40872176}}) = 0
25097 16:59:04.273588 epoll_wait(4, {{EPOLLOUT, {u32=40872176, u64=40872176}}}, 128, -1) = 1
25097 16:59:04.273617 sendmsg(37, {msg_name(0)=NULL, msg_iov(1)=[{data skipped, 247}], msg_controllen=24, {cmsg_len=24, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, {34, 49}}, msg_flags=0}, MSG_NOSIGNAL) = 247
25097 16:59:04.273671 epoll_ctl(4, EPOLL_CTL_DEL, 34, {0, {u32=0, u64=0}}) = 0
25097 16:59:04.273715 close(34)         = 0
25097 16:59:04.273752 close(49)         = 0
25097 16:59:04.273801 epoll_wait(4, {{EPOLLOUT, {u32=40872176, u64=40872176}}}, 128, -1) = 1
25097 16:59:04.273848 epoll_wait(4,  <unfinished ...>

I'm blocked in the last epoll_wait call. My understanding is that, because I'm using edge-triggered mode (EPOLLET), I can certainly block even if the fd is already ready for write operations.

The question is: how can I debug whether a Unix domain socket is ready for write operations? /proc/net/unix shows nothing interesting.


Solution

  • My understanding is that, because I'm using edge-triggered mode (EPOLLET), I can certainly block even if the fd is already ready for write operations.

    I agree.

    The question is: how can I debug whether a Unix domain socket is ready for write operations?

    If you have a kernel image (vmlinux) with debugging symbols, you could run

    gdb vmlinux /proc/kcore

    and then, using the struct sock address from the Num column of /proc/net/unix, evaluate

    p ((struct sock *)0xaddress)->sk_wmem_alloc

    This shows the committed transmit queue bytes; together with other members of the structure (for example sk_sndbuf), it tells you whether the socket's send buffer has space left.
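
If a kernel image with symbols is not at hand, a rough check from inside the process may also help. The sketch below is not part of the original answer: it assumes Linux and Python, and the helper name `write_state` is hypothetical. It samples writability with a zero-timeout `poll()`, which is level-triggered and therefore independent of any edges already consumed, and reads the send-side counter with the `SIOCOUTQ` ioctl (the same request number as `termios.TIOCOUTQ` on Linux), which for Unix domain sockets should correspond to the `sk_wmem_alloc` value inspected above.

```python
import fcntl
import select
import socket
import struct
import termios


def write_state(sock):
    """Return (writable_now, charged_bytes, sndbuf) for a connected socket."""
    # poll() is level-triggered: a zero timeout just samples the current state.
    poller = select.poll()
    poller.register(sock.fileno(), select.POLLOUT)
    writable_now = any(ev & select.POLLOUT for _, ev in poller.poll(0))

    # SIOCOUTQ shares its number with TIOCOUTQ on Linux; the kernel fills an int.
    # The value is kernel buffer accounting, so it can exceed the payload size.
    raw = fcntl.ioctl(sock.fileno(), termios.TIOCOUTQ, struct.pack("i", 0))
    charged_bytes = struct.unpack("i", raw)[0]

    sndbuf = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    return writable_now, charged_bytes, sndbuf


if __name__ == "__main__":
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    print("idle:           ", write_state(a))
    a.send(b"x" * 247)   # unread data stays charged to the sending socket
    print("after send:     ", write_state(a))
    b.recv(4096)         # the peer draining its queue releases that memory
    print("after peer recv:", write_state(a))
    a.close()
    b.close()
```

This only works on descriptors the calling process owns, so it is a debugging aid to call from the application itself rather than a replacement for inspecting another process from outside.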

    But actually you don't need to do that: the strace output already shows the EPOLLOUT event in the next-to-last line, and between that and the epoll_wait in the last line there is no system call that could change the situation, i.e. nothing that could produce a new edge. I think it's unwise to wait in edge-triggered mode here.
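
The edge-triggered behaviour described in this answer can be reproduced in isolation. This is a minimal sketch, assuming Linux and Python's `select.epoll`; the function name `epollout_edges` and the 0.2 s timeout are illustrative choices, not taken from the original post. The socket stays writable the whole time, yet only the first wait should report `EPOLLOUT`, and the second should time out because nothing happened in between to produce a new edge.

```python
import select
import socket


def epollout_edges(timeout=0.2):
    """Wait twice for EPOLLOUT on an idle, writable Unix socket in ET mode."""
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    ep = select.epoll()
    try:
        # Registering an already-writable fd queues one initial EPOLLOUT event.
        ep.register(a.fileno(), select.EPOLLOUT | select.EPOLLET)
        first = ep.poll(timeout)
        # Nothing touched the socket in between, so there is no new edge.
        second = ep.poll(timeout)
        return first, second
    finally:
        ep.close()
        a.close()
        b.close()


if __name__ == "__main__":
    first, second = epollout_edges()
    print("first wait :", first)
    print("second wait:", second)
```

With an infinite timeout (`-1`, as in the strace log) the second call would block indefinitely instead of returning an empty list, which matches the hang in the question.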