Search code examples
node.jsmultithreadingworker-thread

When a workerThread is created in nodejs, does it utilize the same core in which nodejs process is running?


Let's assume i have a nodejs serverProgram with one api and it does some manipulations on the video file, sent via the http request.

const saveVideoFile=(req,res)=>{
  processAndSaveVideoFile(); // can run for minimum of 10 minutes
  res.send({status: "video is being processed"})
}

i decided to to make use of a workerThread to do this processing as my machine has 3 cores (core1,core2,core3) and there is no hyperthreading enabled here

Assume that my nodejs program is running on core1. When i fire up a single workerThread, will the workerThread run on core2/core3 or core1?

i read that workerThread is not the same as childProcess. ChildProcess will fork a new process which will facilitate the childProcess to choose from available free cores (core2 or core3).

i read that workerThread shares memory with the mainThread. Let's assume that i create 2 workerThreads (wt1,wt2). Will my nodejs program, wt1, wt2 run on the same core i.e core1 ?

Also, in nodejs we have eventloop (mainthread) and otherThreads doing the background operations i.e I/O. is it correct to assume that all of these are utilizing the resources available in a single core (core1). if this is the case, is creating and using additional workerThread's an overkill on the nodejs server?

Below is an excerpt from this blog

We can run things in parallel in Node.js. However, we need not to create threads. The operating system and the virtual machine collectively run the I/O in parallel and the JS code then runs in a single thread when it is time to send the data back to the JavaScript code.

i keep reading this same information about nodejs in many articles and video presentations. But what i do not understand is this,

The operating system and the virtual machine collectively run the I/O in parallel

How can the operating system run the I/O requests from nodejs program in parallel without using any of the childProcess or threads spawned from nodejs? if those I/O requests from nodejs program is running in parallel, does it mean that all 3 cores (core1,core2,core3) will be utilized?

There are lot of contents on nodejs, but it doesn't clear doubts related to my above questions. if you have idea on how these things actually work, please share the detail.


Solution

  • A worker thread in node.js is an actual OS thread running in a different instance of V8. As such, it's totally up to the operating system to decide how to allocate it among available CPU cores. If there are cores with available time, then it will not generally be run on the same core as the main nodejs thread when that thread is busy because the OS will allocate busy threads across the various cores.

    But, again this is entirely up to the OS and is not something that nodejs controls and the exact strategy for which cores are used will vary by OS. But, in all modern operating systems, the design goal is that available cores are used for threads that are currently executing. Now, if there are more threads active at once than there are cores, the threads will be time-sliced and all the cores will be active.

    Also, in nodejs we have eventloop (mainthread) and otherThreads doing the background operations i.e I/O. is it correct to assume that all of these are utilizing the resources available in a single core (core1). if this is the case, is creating and using additional workerThread's an overkill on the nodejs server?

    No, it is not correct to assume those threads all use the same core.

    A workerThread in nodejs has its own event loop. For the most part, it does not share memory. In fact, if you want to share memory, you have to very specifically allocated SharedMemory and pass that to the workerThread.

    Is it overkill? Well, it depends upon what you're doing. There are very useful things to do with workerThreads and there are things that they would not be necessary for.

    The operating system and the virtual machine collectively run the I/O in parallel

    I/O in node.js is either asynchronous at the OS level (such as networking) or run in separate threads (such as disk I/O). That means it runs separately from the main thread in node.js that runs your Javascript and can run in parallel with it, synchronizing only at the completion of an event. "Parallel" in this case means that both make progress at the same time. If there are multiple cores, then they can truly be running at exactly the same time. If there was only one core, then the OS will timeslice between the various threads and they will be both make progress (in an interleaved fashion that will seem to be parallel, but really they are taking turns).

    How can the operating system run the I/O requests from nodejs program in parallel without using any of the childProcess or threads spawned from nodejs? if those I/O requests from nodejs program is running in parallel, does it mean that all 3 cores (core1,core2,core3) will be utilized?

    The OS has its own threads for managing things like a network interface or a disk interface. The job of those threads is to interface with the hardware and bring data to an appropriate application or take data from the application and send it to the hardware. These are OS-level threads that exists independent of node.js. Yes, other cores can be used by those OS-level threads. It is important to realize that many operations such as networking are inherently non-blocking. Thus, if you're waiting for some data to arrive on a network interface, you don't need to have a thread doing something the whole time.


    I want to add that it appears in your questions that you've combined questions about a several different things. Mentioned in your questions are:

    1. Worker Threads
    2. Internal node.js threads
    3. Operating system threads

    These are all different things.

    A worker thread is a new thread you can start to run specific pieces of Javascript in another thread so you can have more than one Javascript thread running at the same time. In node.js, this is done by creating a whole new instance of V8, setting up a whole new global environment and loaded modules environment and using almost entirely separate memory.

    Internal node.js threads are used by node.js as part of implementing its event loop and its standard library. Specifically, disk I/O and some crypto operations are run in internal native threads and they communicate with your Javascript via events/callbacks through the event loop.

    Operating system threads are threads that the OS uses to implement it's own system APIs. Since the OS is responsible for lots of things, these threads ca have many different uses. Depending upon native implementations, they may be used to facilitate things like disk I/O or networking I/O. These threads are the responsibility of the OS to create and use and are not directly controlled by node.js.


    Some additional questions asked in comments:

    what is the difference b/w workerThread & childProcess concept in nodejs? is childProcess = workerThread without sharedMemory ?

    A child process can be any type of program - it does not have to be a node.js program. A worker thread is node.js code.

    A worker thread can share memory if sharedMemory is specifically allocated and shared with the worker thread and if it is carefully managed for concurrency issues.

    It is more efficient to copy memory back and forth between worker thread and main thread than with child process.

    If main program exits, worker threads will exit. If main program exits, child process can be configured to exit or to continue.

    If worker thread calls process.exit(), the main thread will exit too. If child program exits, it cannot cause main program to exit without main program's cooperation.

    how nodejs is able to magically interact with the os level thread without nodejs itself creating any threads?, i need additional details on this, your explanation is the common one present in most places including the blog i shared?

    nodejs just calls an OS API. It's the OS API that manages communicating with its own threads (if threads are needed for that specific OS API). How it does that communication internally is implementation dependent and will vary by OS. It will even vary by OS which OS APIs use threads and which don't.