I hit a strange issue where epoll_wait blocks while waiting for an EPOLLOUT event on a Unix domain socket in edge-triggered mode.
Some details: I use Boost.Asio for IPC between two processes, with file descriptor passing.
Here are some strace logs:
25097 16:59:04.273555 epoll_ctl(4, EPOLL_CTL_MOD, 37, {EPOLLIN|EPOLLPRI|EPOLLOUT|EPOLLERR|EPOLLHUP|EPOLLET, {u32=40872176, u64=40872176}}) = 0
25097 16:59:04.273588 epoll_wait(4, {{EPOLLOUT, {u32=40872176, u64=40872176}}}, 128, -1) = 1
25097 16:59:04.273617 sendmsg(37, {msg_name(0)=NULL, msg_iov(1)=[{data skipped, 247}], msg_controllen=24, {cmsg_len=24, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, {34, 49}}, msg_flags=0}, MSG_NOSIGNAL) = 247
25097 16:59:04.273671 epoll_ctl(4, EPOLL_CTL_DEL, 34, {0, {u32=0, u64=0}}) = 0
25097 16:59:04.273715 close(34) = 0
25097 16:59:04.273752 close(49) = 0
25097 16:59:04.273801 epoll_wait(4, {{EPOLLOUT, {u32=40872176, u64=40872176}}}, 128, -1) = 1
25097 16:59:04.273848 epoll_wait(4, <unfinished ...>
And I'm blocked in the last epoll_wait call.
My understanding is that, because I'm using edge-triggered mode (EPOLLET), I can certainly block if the fd is already ready for write operations.
The question is: how can I debug whether a Unix domain socket is ready for write operations? /proc/net/unix shows nothing interesting.
My understanding is that, because I'm using edge-triggered mode (EPOLLET), I can certainly block if the fd is already ready for write operations.
I agree.
The question is: how can I debug whether a Unix domain socket is ready for write operations?
If you have a kernel image with debugging symbols, you could run
gdb vmlinux /proc/kcore
and then, using the struct sock address from the Num column of /proc/net/unix, evaluate
p ((struct sock *)0xaddress)->sk_wmem_alloc
to inspect the committed transmit queue bytes and other structure members, and so see whether the socket's send buffer has space left.
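If you can add code to the process itself, a rougher check is possible without kernel symbols. The following is only a sketch: the helper names fd_writable_now and fd_unsent_bytes are made up for illustration, and I am assuming a Linux system where the SIOCOUTQ ioctl is supported on AF_UNIX sockets. A poll call with a zero timeout is level-triggered, so it reports the current write readiness regardless of how the fd is registered with epoll.

```c
#include <poll.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <linux/sockios.h>   /* SIOCOUTQ (Linux-specific) */

/* Hypothetical helper: 1 if fd is writable right now, 0 if not, -1 on error.
 * poll() is level-triggered and a timeout of 0 means it never blocks. */
int fd_writable_now(int fd)
{
    struct pollfd p = { .fd = fd, .events = POLLOUT };
    int rc = poll(&p, 1, 0);
    if (rc < 0)
        return -1;
    return (rc == 1 && (p.revents & POLLOUT)) ? 1 : 0;
}

/* Hypothetical helper: the kernel's count of send-queue memory still
 * charged to this socket (assumption: SIOCOUTQ works on this socket type).
 * Returns -1 if the ioctl fails. */
int fd_unsent_bytes(int fd)
{
    int pending = 0;
    if (ioctl(fd, SIOCOUTQ, &pending) < 0)
        return -1;
    return pending;
}
```

Calling fd_writable_now(37) just before the epoll_wait that hangs would tell you whether the socket was in fact writable at that moment.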
But actually you don't need to do that: the next-to-last line of the strace output already shows the EPOLLOUT event, and between that and the epoll_wait in the last line there is no system call that could change the situation, i.e. no signal edge. I think it's unwise to wait edge-triggered here.
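To see the effect in isolation, here is a minimal sketch (not your Boost.Asio setup; the function name second_wait_event_count is made up, and it uses a fresh socketpair instead of your fd 37). It registers a writable Unix socket with EPOLLOUT|EPOLLET and calls epoll_wait twice with nothing in between. I use a timeout of 0 instead of -1 so the second call returns instead of hanging.

```c
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo: returns how many events the SECOND epoll_wait reports
 * for an already-writable socket registered edge-triggered, or -1 if the
 * setup fails or the first wait does not report exactly one event. */
int second_wait_event_count(void)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    int ep = epoll_create1(0);
    if (ep < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    struct epoll_event ev = { .events = EPOLLOUT | EPOLLET, .data.fd = sv[0] };
    int first = -1, second = -1;
    if (epoll_ctl(ep, EPOLL_CTL_ADD, sv[0], &ev) == 0) {
        struct epoll_event out;
        /* First wait: the initial "writable" state is reported once. */
        first = epoll_wait(ep, &out, 1, 0);
        /* Second wait: the socket is still writable, but nothing happened
         * in between, so there is no new edge to report. */
        second = epoll_wait(ep, &out, 1, 0);
    }

    close(ep);
    close(sv[0]);
    close(sv[1]);
    return first == 1 ? second : -1;
}
```

The second wait reports no events even though the socket is writable the whole time; with a timeout of -1 it would block just like the last line of your strace output. The usual pattern with EPOLLET is to keep writing until sendmsg fails with EAGAIN and only then go back to epoll_wait.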