How does a web server manage its thread pool to serve concurrent requests?

How does a web server that maintains a pool of worker threads ensure that, when two requests come in simultaneously, the same thread is not assigned to both of them? In other words, how does it implement "each request has its own thread"?

If I were to implement this thread pool, I would use a queue to hold my threads and synchronize all "get thread" operations. But clearly this is inefficient.

So what do the web servers do?

Solution

It depends on the web server.

For instance, Nginx doesn't use multiple threads:

Nginx is one of a handful of servers written to address the C10K problem. Unlike traditional servers, Nginx doesn't rely on threads to handle requests. Instead it uses a much more scalable event-driven (asynchronous) architecture. This architecture uses small, but more importantly, predictable amounts of memory under load. Even if you don't expect to handle thousands of simultaneous requests, you can still benefit from Nginx's high-performance and small memory footprint. Nginx scales in all directions: from the smallest VPS all the way up to clusters of servers.

http://wiki.nginx.org/Main
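To give a feel for what "event-driven" means here, below is a minimal sketch in Python using asyncio. It is not how Nginx is implemented (Nginx is written in C); it only illustrates the idea that a single thread can have many requests in flight at once, because a handler gives up control whenever it would otherwise wait on I/O. The names handle_request, serve and run_event_loop_demo are made up for this example, and asyncio.sleep stands in for real network I/O.

```python
import asyncio
import threading


async def handle_request(request_id, seen_threads):
    # Record which OS thread is running this handler.
    seen_threads.add(threading.get_ident())
    # Stand-in for non-blocking network I/O: while this handler waits,
    # the event loop is free to run other handlers on the same thread.
    await asyncio.sleep(0.01)
    return f"response-{request_id}"


async def serve(num_requests):
    seen_threads = set()
    # Start all handlers concurrently; gather returns results in call order.
    responses = await asyncio.gather(
        *(handle_request(i, seen_threads) for i in range(num_requests))
    )
    return responses, seen_threads


def run_event_loop_demo(num_requests=100):
    # Hypothetical entry point: runs the event loop until all requests finish.
    return asyncio.run(serve(num_requests))
```

Calling run_event_loop_demo(100) returns the 100 responses together with the set of thread ids that ran a handler, and that set contains a single id. No thread has to be picked per request, so the "which thread gets which request" question does not come up in this model.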

Apache, on the other hand, can run multi-process, multi-threaded, or as a hybrid of both, depending on which multi-processing module (MPM) it is configured with.

If I were to implement a threadpool for a web server, I would probably put requests on a blocking queue.

All threads in the pool would wait on the queue. Once a request comes in, the first available thread would take it and answer it. If another request comes in while the first thread is still answering, it would automatically go to another thread, because the blocking queue hands each item to exactly one waiting consumer. Once a thread has finished answering, it simply waits on the queue again, ready for the next request. In pseudo-code, this would look like:

Web server code to handle request

function onRequestReceived(request)
    // Note: request is a custom object holding information about the request,
    // such as the TCP connection, the headers, and possibly additional info.
    requestQueue.put(request)

Then, the worker thread would look something like:

function run()
    while true:
        request = requestQueue.get()
        handleRequest(request)

All threads are started when the application starts. That is about it.
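For something you can actually run, here is a rough Python version of the pseudo-code above, assuming the standard queue.Queue as the blocking queue. The pool size, the None shutdown sentinel, the results list and the run_pool_demo helper are additions for the sake of the demonstration and are not part of the original pseudo-code; handle_request is only a placeholder for real request handling.

```python
import queue
import threading

NUM_WORKERS = 4                   # hypothetical pool size
request_queue = queue.Queue()     # thread-safe blocking queue
results = []                      # (worker name, request) pairs, for the demo
results_lock = threading.Lock()


def handle_request(request):
    # Placeholder for real work (parse the headers, write a response, ...).
    with results_lock:
        results.append((threading.current_thread().name, request))


def worker():
    while True:
        request = request_queue.get()    # blocks until an item is available
        try:
            if request is None:          # sentinel: stop this worker
                return
            handle_request(request)
        finally:
            request_queue.task_done()    # runs for requests and the sentinel


def on_request_received(request):
    # Called by the accepting code; each item goes to exactly one worker.
    request_queue.put(request)


def run_pool_demo(num_requests=10):
    results.clear()
    # All threads are started up front, as described above.
    threads = [
        threading.Thread(target=worker, name=f"worker-{i}")
        for i in range(NUM_WORKERS)
    ]
    for t in threads:
        t.start()
    for i in range(num_requests):
        on_request_received(f"request-{i}")
    request_queue.join()                 # wait until every request is handled
    for _ in threads:
        request_queue.put(None)          # one sentinel per worker
    for t in threads:
        t.join()
    return list(results)
```

Calling run_pool_demo(10) returns ten (worker name, request) pairs, and each request appears exactly once. Which worker handles which request varies from run to run. The only synchronization is inside the queue itself, so there is no separate "get thread" step to lock: idle workers are simply the ones blocked in get().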