Should I assert fail on select() EBADF?

I am trying to fix a bug in an event loop that calls select(). When select() fails with EBADF, an error is logged, the fd set is re-initialized, and select() is called again. The result is an infinite tight loop of logging that generates gigabytes of log output in a matter of seconds.

This error occurs when one of the TCP servers my program is connected to disconnects uncleanly (e.g. it segfaults). In that case I would ideally want my program to remove that fd and keep running (or shut down if that isn't feasible).

My question is: should select() ever fail with EBADF, or is that an indication that my program is buggy? I.e., should I assert-fail on EBADF, and if not, how should I handle it? Would I loop through the fd set to find the "bad" file descriptor?

Solution

You have a bug in your code. Fix it. Somewhere you are closing a socket without removing it from the fd set you pass to select(). Or else you have just made up an FD that isn't an FD and are putting it in the fd set.
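If you need to track down which descriptor is the bad one, here is a minimal sketch of a diagnostic helper. It assumes you keep a master fd_set and a maxfd value; the name remove_bad_fds() is made up for illustration. It probes each descriptor in the set with fcntl(fd, F_GETFD), which fails with EBADF for a descriptor that isn't open. Treat it as a debugging aid that stops the log flood and names the culprit, not as the fix.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/select.h>

/*
 * Hypothetical helper (not part of any library): scan the master set for
 * descriptors that are not open, report them, and remove them from the set.
 * Returns the number of descriptors removed.
 */
int remove_bad_fds(fd_set *master, int maxfd)
{
    int removed = 0;

    /* FD_ISSET/FD_CLR are undefined for fds outside [0, FD_SETSIZE). */
    assert(maxfd < FD_SETSIZE);

    for (int fd = 0; fd <= maxfd; fd++) {
        if (!FD_ISSET(fd, master))
            continue;

        /* F_GETFD has no side effects; it fails with EBADF if fd is not open. */
        if (fcntl(fd, F_GETFD) == -1 && errno == EBADF) {
            fprintf(stderr,
                    "BUG: fd %d is in the select() set but is not open\n", fd);
            FD_CLR(fd, master);
            removed++;
        }
    }
    return removed;
}
```

Call it when select() returns -1 with errno == EBADF. In a debug build you might follow it with assert(0) so the bug gets noticed; in production, logging once and carrying on is probably kinder than aborting. Either way, the real fix is at the place where the descriptor was closed without being cleared from the set.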

Contrary to other statements here, network problems cannot cause this error. Network outages do not close sockets, and closing a socket is the only way its descriptor can become invalid. A socket whose connection isn't working will eventually cause an ECONNRESET if you keep writing to it. A socket whose peer has disconnected will become readable and a recv() on it will return zero.
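A rough sketch of handling the disconnect case, assuming a master fd_set that is copied before each select() call. The names handle_readable() and drop_connection() are made up for illustration, and the buffer size is arbitrary. The point is the ordering: clear the descriptor from the set before closing it, so a closed descriptor is never handed to select().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: remove fd from the master set, then close it. */
void drop_connection(int fd, fd_set *master)
{
    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_CLR(fd, master);   /* first, so select() never sees a closed fd */
    close(fd);
}

/*
 * Hypothetical handler, called when select() reports fd as readable.
 * Returns 1 if the connection is still open, 0 if it was dropped.
 */
int handle_readable(int fd, fd_set *master)
{
    char buf[4096];
    ssize_t n = recv(fd, buf, sizeof buf, 0);

    if (n > 0) {
        /* Application-specific processing of buf[0..n) would go here. */
        return 1;
    }
    if (n < 0 && (errno == EINTR || errno == EAGAIN))
        return 1;         /* transient condition, try again next iteration */

    if (n == 0)
        fprintf(stderr, "fd %d: peer closed the connection\n", fd);
    else
        perror("recv");   /* e.g. ECONNRESET after an unclean disconnect */

    drop_connection(fd, master);
    return 0;
}
```

If every close() in the program goes through something like drop_connection(), select() should have no way to see a stale descriptor, and EBADF goes back to meaning "there is a bug".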