c++ network-programming server tcp winsock

WSAPoll vs Overlapped WSARecv Performance?


I'm creating a server that must handle 1000+ clients. The method I'm currently using is:

One thread uses WSAAccept to handle incoming connections. It has a thread pool in which each thread handles multiple clients at a time using WSAPoll.

For example, when a client has just connected, the server finds a poller thread that is free and adds the client's socket to that thread's WSAPoll fd set, so that poller thread handles the new client connection.

Each poller thread uses non-blocking WSAPoll to watch its connections, and then uses (blocking) recv() to receive the packets.

Server: WSAAccept
    Thread Poller #1: WSAPoll [1, 2, 3, 4, 5]  // max out
        recv[1, 2, 3, 4, 5]
    Thread Poller #2: WSAPoll [6, 7, 8, 9, 10] // max out
        recv[6, 7, 8, 9, 10]
    Thread Poller #3: WSAPoll [11, 12]         // free
        recv[11, 12]
    // create more pollers if all maxed out
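
The assignment rule in the diagram above (fill the first poller that still has room, create a new poller when all are maxed out) can be sketched in portable C++. This is only a model of the bookkeeping, not Winsock code: `Poller`, `PollerPool`, `assign`, `pollerCount` and `load` are hypothetical names, and the capacity of 5 is taken from the diagram. A real poller would own a thread and a WSAPOLLFD array rather than a list of client IDs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a poller thread: it only records which client
// IDs it owns. A real poller would hold a WSAPOLLFD array and call WSAPoll.
struct Poller {
    std::vector<int> clients;
};

// Hypothetical pool that hands each new client to the first poller with room.
class PollerPool {
public:
    explicit PollerPool(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);  // a poller must be able to hold at least one client
    }

    // Assigns the client and returns the index of the poller that received it.
    std::size_t assign(int clientId) {
        for (std::size_t i = 0; i < pollers_.size(); ++i) {
            if (pollers_[i].clients.size() < capacity_) {
                pollers_[i].clients.push_back(clientId);
                return i;
            }
        }
        // Every existing poller is maxed out: create one more.
        pollers_.push_back(Poller{});
        pollers_.back().clients.push_back(clientId);
        return pollers_.size() - 1;
    }

    std::size_t pollerCount() const { return pollers_.size(); }
    std::size_t load(std::size_t i) const { return pollers_.at(i).clients.size(); }

private:
    std::size_t capacity_;
    std::vector<Poller> pollers_;
};
```

Assigning clients 1 through 12 to a pool with capacity 5 reproduces the layout in the diagram: two full pollers and a third one holding two clients. In a multithreaded server the pool would also need a lock, or the accept thread would have to be the only caller of `assign`.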

It is working fine for me, but then I came across a (possibly) better solution using overlapped sockets with WSARecv.

The idea here is to use a non-blocking WSARecv with a completion callback instead of WSAPoll:

    CompletionCallback(){ WSARecv(socket, CompletionCallback); }
Loop:
    socket = WSAAccept();
    WSARecv(socket, CompletionCallback); // CompletionCallback will handle the connection.

This eliminates the need for multithreading and/or WSAPoll. I've made a PoC and it seems to work just fine, but it is single-threaded. I wonder how its performance compares to the old method.

Thanks!


Solution

  • OVERLAPPED I/O scales remarkably well: scaling down, scaling up, and scaling out. My tooling uses AcceptEx with OVERLAPPED I/O plus the NT thread pool, and it scales well to hundreds of thousands of connections. https://github.com/Microsoft/ctsTraffic