
FD_SETSIZE versus calculated value


In a server/client setup, I have a server communicating with clients over a handful of different sockets (four at the moment). I currently call select with a calculated set_size, but what is the upper limit before it becomes worthwhile to pass FD_SETSIZE instead?

Below are some code examples to illustrate the point. First building the set:

FD_ZERO(&set);
FD_SET(socket1, &set);
FD_SET(socket2, &set);
FD_SET(socket3, &set);
FD_SET(socket4, &set);

Here is how the set_size is calculated:

set_size = MAX(socket1, socket2);
set_size = MAX(set_size, socket3);
set_size = MAX(set_size, socket4);
set_size += 1;

And the usage:

while ((cnt = select(set_size, &set, NULL, NULL, &t)) != -1 || errno == EINTR) {
    if (cnt > 0) {
        // Do different stuff depending on which socket is active
    } else {
        // Keep everything alive and add the sockets to the set again
    }
}
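
Since select() overwrites the set it is given, the build and set_size steps above have to be repeated before every call. A minimal sketch of folding both into one pass, assuming the sockets are kept in an array (the helper name build_read_set and the array layout are hypothetical, not part of the original code):

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical helper: rebuild the read set from an array of descriptors
 * and return the value to pass as the first argument of select()
 * (highest descriptor + 1, or 0 for an empty array). */
int build_read_set(const int *fds, size_t count, fd_set *set)
{
    int max_fd = -1;

    FD_ZERO(set);
    for (size_t i = 0; i < count; i++) {
        /* select() cannot handle descriptors >= FD_SETSIZE */
        assert(fds[i] >= 0 && fds[i] < FD_SETSIZE);
        FD_SET(fds[i], set);
        if (fds[i] > max_fd)
            max_fd = fds[i];
    }
    return max_fd + 1;
}
```

Calling this at the top of each loop iteration keeps set and set_size in step when sockets are added later. The timeout should be reset at the same point, because select() may modify it as well.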

Recently I had to add two new sockets, and I might need to add more in the future. When would you use FD_SETSIZE as opposed to the calculated set_size?


Solution

  • I have never worried about this because it seems like it would make a very small difference compared to the performance penalty for using select() in the first place.

    Having said that, I think it's always worth calculating the correct value, because it is cheap to do: if you keep the current set_size in a variable as you propose, adding an fd is O(1) with a very low constant (namely, a compare and possibly an update). Removing an fd is also O(1) unless it is the highest-numbered one, in which case you have to search for the new maximum; that is O(set_size) in the worst case but usually much better. On the other hand, NOT calculating set_size means that the kernel has to examine all FD_SETSIZE entries every single time you call select(). Since set_size is probably quite a bit smaller than FD_SETSIZE, it pays to supply the smaller value. Even if set_size is close to FD_SETSIZE, calculating it is so cheap that it's almost always still worth it.

    Of course, if you're worried about performance to this extent, you should be looking at poll() instead of select(). Better yet, look at epoll or kqueue, although neither is portable: epoll is available only on Linux, and kqueue only on the BSD family (FreeBSD and its relatives, including macOS).
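
The bookkeeping described above (O(1) add, O(1) remove except when the highest fd goes away) could look roughly like the following. This is only a sketch; struct fd_tracker and the tracker_* functions are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <sys/select.h>

/* Hypothetical bookkeeping struct: the watched descriptors plus the
 * highest one currently in the set (-1 when the set is empty). */
struct fd_tracker {
    fd_set watched;
    int max_fd;
};

void tracker_init(struct fd_tracker *t)
{
    FD_ZERO(&t->watched);
    t->max_fd = -1;
}

/* O(1): one compare and possibly one update. */
void tracker_add(struct fd_tracker *t, int fd)
{
    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_SET(fd, &t->watched);
    if (fd > t->max_fd)
        t->max_fd = fd;
}

/* O(1) unless fd was the highest descriptor; then scan downwards
 * for the new maximum, which is O(set_size) in the worst case. */
void tracker_remove(struct fd_tracker *t, int fd)
{
    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_CLR(fd, &t->watched);
    if (fd == t->max_fd) {
        while (t->max_fd >= 0 && !FD_ISSET(t->max_fd, &t->watched))
            t->max_fd--;
    }
}

/* Value to pass as the first argument of select(). */
int tracker_nfds(const struct fd_tracker *t)
{
    return t->max_fd + 1;
}
```

Because select() overwrites the set it is given, the caller would copy t.watched into a scratch fd_set before each call and pass tracker_nfds(&t) as the first argument.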