c++ http sockets posix-select

Why does select's behaviour differ when trying to read and write sockets?


Let's say we have a client file descriptor accepted with accept():

client_socket = accept(_socket, (sockaddr *)&client_addr, &len);

We now set this file descriptor in a read and write fd_set:

fd_set readfds;
fd_set writefds;

//zero them
FD_ZERO(&readfds);
FD_ZERO(&writefds);

//set the client_socket
FD_SET(client_socket, &readfds);
FD_SET(client_socket, &writefds);

Now we use select to check whether the socket is readable or writable:

select(FD_SETSIZE, &readfds, &writefds, NULL, NULL);

We now check if we can read first and read all bytes from it.

if (FD_ISSET(client_socket, &readfds)) {
    read(client_socket, &buf, 4096);
}
//assume that buf is big enough and that read returns less than 4096

In the next loop we reset the fd_sets just like before. Now select will allow us to write our response to the client:

if (FD_ISSET(client_socket, &writefds)) {
    write(client_socket, &buf, len(buf)); // len(buf): pseudocode for the response length
}

Up to here everything works fine, but now the weird behaviour occurs. Let's assume that our client told us to keep the connection alive. In that case we would set the fd_sets the same way as before, like this:

//zero them
FD_ZERO(&readfds);
FD_ZERO(&writefds);

//set the client_socket
FD_SET(client_socket, &readfds);
FD_SET(client_socket, &writefds);
// reading not allowed

When using select now, it will allow us to write again but it won't allow us to read from the client_socket. BUT if we leave the client_socket out of writefds, it will allow us to read, although we did not change anything in the readfds.

//zero them
FD_ZERO(&readfds);
FD_ZERO(&writefds);

//set the client_socket
FD_SET(client_socket, &readfds);
//FD_SET(client_socket, &writefds); -> don't set the file-descriptors for write
// now reading is allowed

Can someone explain to me whether this is the correct behaviour of select, or whether the fault lies in other parts of my code that I didn't show (way too complex)? To me it seems like the behaviour of select is random when setting both sets (writing and reading). I know that there is a way around this by keeping some kind of state to decide whether to set the reading file descriptors or the writing file descriptors, but I was hoping for a cleaner solution.


Solution

  • The purpose of select() is to not return until there is something for your program to do. That way your program can sleep inside select() until I/O is ready, wake up immediately to do the I/O, and then go back to sleep as quickly as possible afterwards.

    So the question is, how does select() know when to return? The answer is, you have to tell it what should cause it to return, by calling FD_SET() in the appropriate ways.

    Usually you will want select() to return when data is ready-for-read on any of your sockets (so you can read the newly-arrived data), so you should usually call FD_SET(mySocket, &readFD) on all of your sockets.

    FD_SET(mySock, &writeFD) is a bit more nuanced. It tells select() to return when the socket has buffer-space available to write output bytes to. However, in many cases you don't want select() to return when a socket has buffer-space available, simply because you don't currently have any data that you want to write to the socket anyway. In that scenario, if you always call FD_SET(mySocket, &writeFD) then select() will keep returning immediately even though you don't have any task you want to perform, and that will cause your program to use up lots of CPU cycles for no good purpose.

    So the only time you should call FD_SET(mySocket, &writeFD) is if you know that you want to write some data to that socket ASAP.

    In your program, what is likely happening is that FD_SET(mySocket, &writeFD) is causing select() to return immediately (because mySocket currently has buffer-space available to write to), and then your program is (mistakenly) assuming that because select() has returned, the socket is ready-for-read, only to find out that it isn't. In the case where you've commented out the FD_SET(mySocket, &writeFD), OTOH, select() doesn't return until the socket is ready-for-read, and so you get the behavior you expect when you call FD_ISSET(mySocket, &readFD).
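
To make the behaviour described above concrete, here is a minimal sketch, assuming a POSIX system. It uses a local socketpair() in place of a real accepted client, and the names Readiness, wait_for_io, want_write and demo are hypothetical helpers made up for illustration (they are not from the question's code). It registers write interest only when the caller says it has output pending, and it checks each fd_set separately with FD_ISSET instead of assuming that a return from select() means "readable".

```cpp
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdio>

// What select() reported for one socket.
struct Readiness {
    bool readable;
    bool writable;
};

// Hypothetical helper: wait up to timeout_ms for activity on fd.
// Read interest is always registered; write interest only when the
// caller actually has output pending (want_write).
Readiness wait_for_io(int fd, bool want_write, int timeout_ms) {
    fd_set readfds, writefds;
    FD_ZERO(&readfds);
    FD_ZERO(&writefds);
    FD_SET(fd, &readfds);
    if (want_write) FD_SET(fd, &writefds);

    timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    int n = select(fd + 1, &readfds, &writefds, nullptr, &tv);
    if (n <= 0) return {false, false};  // timeout or error: nothing is ready
    // A return from select() only means *something* is ready; check each set.
    return {FD_ISSET(fd, &readfds) != 0, FD_ISSET(fd, &writefds) != 0};
}

// Runs three small experiments on a local socket pair.
// Returns 0 if select() behaved as the answer describes, 1 otherwise.
int demo() {
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
        perror("socketpair");
        return 1;
    }

    // 1) Peer is silent, write interest set: select() returns at once
    //    because there is buffer space, but the socket is not readable.
    Readiness a = wait_for_io(sv[0], true, 200);

    // 2) Peer is silent, read interest only: select() waits and times out.
    Readiness b = wait_for_io(sv[0], false, 200);

    // 3) Peer sends data, read interest only: the socket is now readable.
    const char msg[] = "ping";
    ssize_t sent = write(sv[1], msg, sizeof msg - 1);
    Readiness c = wait_for_io(sv[0], false, 200);

    std::printf("write interest, peer silent: readable=%d writable=%d\n",
                a.readable, a.writable);
    std::printf("read only, peer silent:      readable=%d writable=%d\n",
                b.readable, b.writable);
    std::printf("read only, peer sent data:   readable=%d writable=%d\n",
                c.readable, c.writable);

    close(sv[0]);
    close(sv[1]);

    bool ok = sent == 4 && !a.readable && a.writable &&
              !b.readable && !b.writable && c.readable && !c.writable;
    return ok ? 0 : 1;
}
```

Calling demo() from main() should show the first case returning immediately as writable but not readable, which matches the "reading not allowed" symptom in the question. In a real server, want_write would typically be something like "the outgoing buffer for this client is not empty", which is the small piece of state the question was hoping to avoid; as far as I know, select() offers no way around keeping it.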