c++ sockets kqueue

Understanding kqueue in TCP


I am following tutorials about kqueue (specifically http://eradman.com/posts/kqueue-tcp.html and https://wiki.netbsd.org/tutorials/kqueue_tutorial/), and there are parts I don't understand. Here's my (edited) code:

// assume internal_socket is listening
void run_server(int internal_socket) {
    const int nchanges = 1;
    const int nevents = BACKLOG;

    struct kevent change_list[nchanges];
    struct kevent event_list[nevents];

    int kq = kqueue();

    if (kq == -1) {
        // error
    }

    EV_SET(&change_list[0], internal_socket, EVFILT_READ, EV_ADD, 0, 0, NULL);

    while (true) {
        int nev = kevent(kq, change_list, nchanges, event_list, nevents, NULL);

        if (nev == -1) {
            // error
        }

        for (int i = 0; i < nev; ++i) {
            if (event_list[i].flags & EV_EOF) {
                int fd = event_list[i].ident;
                EV_SET(&change_list[0], fd, EVFILT_READ, EV_DELETE, 0, 0, NULL);
                if (kevent(kq, change_list, nchanges, NULL, 0, NULL) == -1) {
                    // error
                }
                close(fd);
            } else if (event_list[i].ident == internal_socket) {
                int fd = accept(event_list[i].ident, ...);
                // do stuff
            } else if (event_list[i].filter == EVFILT_READ) {
                int bytes_read = recv(event_list[i].ident, ...);
                // do stuff
            }
        } // for
    } // while (true)
} // func

I don't understand:

  1. Am I right to set nevents = BACKLOG i.e. the number of concurrent connections? If not, what should nevents be?

  2. Why do I check event_list[i].flags & EV_EOF? My best guess is if the connection failed while the socket was sitting in the queue, then I want to remove that socket from the queue? But why do I call kevent again?

  3. In the same section as the previous point, I call close(fd). Is that correct? The eradman tutorial has some extra witchcraft but I don't understand why.

  4. If I understand correctly, kqueue could return when I am ready to read a partial message. How do I know when the message is complete?

In case it's relevant, I'm working on OS X.


Solution

  • Quick thoughts about the code / questions:

    1. No.

      There's no requirement that BACKLOG == nevents.

      You can pick the events out of the queue one at a time or dozens at a time. It's mostly about your preference and about trading fewer system calls against memory/stack space.

      It's also rare for all the connections to fire events simultaneously, so there's no real point in spending that much memory. With high concurrency, a large event array may even cause cache misses and incur a performance penalty.

    2. EV_EOF

      Filters may set this flag to indicate filter-specific EOF condition

      Which means you must specify a filter that might raise this flag. Which filters do that? Are these the events you're listening for?

      You can find these in the kqueue man page.

      One example, with sockets, is when the client shuts down its read side but leaves the write side of the socket in place. Then the EVFILT_WRITE filter (if set) will set the EV_EOF flag. The man page also documents EVFILT_READ setting EV_EOF once the read direction of the socket has shut down, which is the case your code checks for.

      Personally, I think these edge cases can be checked when write fails instead of having an event raised.

    3. Calling close is reasonable; it really depends on what you want. I might keep the connection open only to read data, or maybe invoke shutdown, as it's considered more polite (but probably doesn't matter all that much these days).

    4. You don't. This isn't a TCP/IP concern.

      Message wrapping should be performed by the protocol being implemented (e.g. WebSockets / HTTP). Each protocol has a different message wrapping / completion design.

      The TCP/IP layer wraps packets, not messages. In the wild, packets are often limited to 1500 bytes and many parts of the internet run on 576 bytes, so a single read may return only part of a message, or several messages at once. You can search for MTU for more information.
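As a rough illustration of protocol-level message wrapping, here is a minimal sketch that assumes a made-up convention: every message is preceded by a 4-byte big-endian length. Neither the question nor the tutorials specify a protocol, so the `FrameBuffer` class and its method names are hypothetical. The idea is to append whatever `recv` returns and only hand out a message once all of its bytes have arrived.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical per-connection buffer. Assumes each message is sent as a
// 4-byte big-endian length followed by that many payload bytes.
class FrameBuffer {
public:
    // Append the bytes returned by one recv() call, whatever their size.
    void append(const char* data, std::size_t len) { buf_.append(data, len); }

    // Extract the next complete message into `out`.
    // Returns false if the buffered bytes do not yet hold a full message.
    bool next_message(std::string& out) {
        const std::size_t header = 4;
        if (buf_.size() < header) return false;      // length not complete yet

        std::uint32_t len = 0;
        for (std::size_t i = 0; i < header; ++i)
            len = (len << 8) | static_cast<unsigned char>(buf_[i]);

        // A real server would also reject absurdly large lengths here.
        if (buf_.size() - header < len) return false; // payload not complete yet

        out = buf_.substr(header, len);
        buf_.erase(0, header + static_cast<std::size_t>(len));
        return true;
    }

private:
    std::string buf_;
};
```

On each read event you would call `append` with the received bytes and then call `next_message` in a loop until it returns false, since one read can complete zero, one, or several messages.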

    Side-Note:

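    • You probably want to add new clients to the kqueue after accept; otherwise no read events will ever be reported for them.

    • I would consider resetting the nchanges value every cycle. As written, every kevent call in the loop resubmits whatever is currently in change_list.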
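The two side notes could be sketched roughly as follows. This is only an illustration under assumptions: the helper names `watch_read`, `accept_and_watch` and `wait_events` are hypothetical, error handling is minimal, and the code is guarded so that it compiles only on systems that provide kqueue (BSD-derived systems, including OS X).

```cpp
// kqueue exists only on BSD-derived systems (including OS X),
// so this sketch is compiled only there.
#if defined(__APPLE__) || defined(__FreeBSD__) || defined(__NetBSD__) || \
    defined(__OpenBSD__) || defined(__DragonFly__)
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: register fd for read events with a one-off kevent call.
bool watch_read(int kq, int fd) {
    struct kevent change;
    EV_SET(&change, fd, EVFILT_READ, EV_ADD, 0, 0, NULL);
    return kevent(kq, &change, 1, NULL, 0, NULL) != -1;
}

// Hypothetical helper: accept one pending client and start watching it.
// Returns the client descriptor, or -1 on failure.
int accept_and_watch(int kq, int listen_fd) {
    int client = accept(listen_fd, NULL, NULL);
    if (client == -1) return -1;
    if (!watch_read(kq, client)) {
        close(client);
        return -1;
    }
    return client;
}

// Hypothetical helper: wait for events without submitting any changes
// (changelist = NULL, nchanges = 0), so nothing is resubmitted per cycle.
int wait_events(int kq, struct kevent* events, int max_events,
                const struct timespec* timeout) {
    return kevent(kq, NULL, 0, events, max_events, timeout);
}
#endif
```

In the loop from the question, this would mean calling `wait_events(kq, event_list, nevents, NULL)` instead of passing `change_list` every time, and calling `accept_and_watch(kq, internal_socket)` in the branch that handles the listening socket. The size of `event_list` stays a free choice, as discussed in point 1.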