Search code examples
node.jsthreadpoolcontext-switchsingle-threaded

Node.js single thread VS Tranditonal webserver thread pool


I am a newbie to node.js. I am currently reading the book called 'Beignning Node.js' by Basarat Ali Syed.

Here is an excerpt from it which states the disadvantage of thread pool of traditional web servers:

Most web servers used thread pool this method a few years back and many continue to use today. However, this method is not without drawbacks. Again there is wasting of RAM between threads. Also the OS needs to context switch between threads (even when they are idle), and this results in wasted CPU resources.

I don't quite understand why there is context switch between threads inside thread pool. As far as I could understand, one thread will last during the duration of a task. And once the task is completed, the thread will be free to receive the next task.

So My Q1: Why does it need context switch? When will the context switch between threads happen?

My Q2: Why does not node.js use multiple threads to handle events in the event queue? Isn't it more efficient and reduce the queuing time of events?


Solution

  • Context switch is when the OS need to run more threads than there are CPU cores. Say for example you have 10 threads. And they are all busy (meaning none of them have finished completing their tasks). But your CPU is only a dual core CPU (assume no hyperthreading for simplicity). So, how can all 10 threads run? It's not possible!!

    The answer is context switch. The OS, when presented with lots of processes and threads to execute, will allocate a certain amount of time for each thread to run. After this time the OS will switch to another thread so that all threads will get some time to use the CPU.

    The term "context switch" refers to the fact that when the OS needs to give the CPU to another thread/process it needs to copy all the values in registers temporarily to that thread's memory otherwise the other process/thread will mess up the calculation of the switched thread when it resumes. The OS will also need to re-point the virtual memory tables so that two processes will not mess up each other's memory. How expensive this operation is depends on the CPU architecture. Some architectures like the Sparc are optimized for context switching. Hyperthreading is a feature that implements context switching in hardware so it's faster (but then again, you only get one extra context per CPU with Hyperthreading as implemented on Intel/AMD64 architecture).

    Not using multiple threads completely avoids context switching. Especially if your program is the only program running. So on a single core CPU, a nonblocking, single-threaded program can often beat a multithreaded program.

    However, it's rare to find a single core CPU these days. The ideal number of threads you'd want to run is equal to the number of cores you have. Doing so would also avoid context switching. But even so, getting a complex multithreaded program to run fast is not easy. It's easier to get a nonblocking singlethreaded program to run fast. And in most web applications a multithreaded program wouldn't have any advantage over a nonblocking singlethreaded program because they're both I/O bound.

    A nonblocking singlethreaded program is basically implementing thread-like behavior in userspace using events. This is sometimes called "green threads" in languages that support syntax that make event-oriented programming look like multithreaded programming.