node.js fork cluster-computing node-cluster

Node.js Clustering- Forking, how much memory is actually used?

http://stackabuse.com/setting-up-a-node-js-cluster/ says

" To be clear, forking in Node is very different than a POISIX fork in that it doesn't actually clone the current process, but it does start up a new V8 instance.

Although this is one of the easiest ways to multi-thread, it should be used with caution. Just because you're able to spawn 1,000 workers doesn't mean you should. Each worker takes up system resources, so only spawn those that are really needed. The Node docs state that since each child process is a new V8 instance, you need to expect a 30ms startup time for each and at least 10mb of memory per instance."

But https://nodejs.org/api/cluster.html says

"There is no routing logic in Node.js, or in your program, and no shared state between the workers. Therefore, it is important to design your program such that it does not rely too heavily on in-memory data objects for things like sessions and login."

If the workers (forked processes) aren't actually clones of the master process, then how is it that there is also no shared state?

I was under the impression that if the master process has a one gigabyte JSON string, then all the child processes would also have clones of that one gigabyte JSON string. So with two children there would be 3gb of memory used. What actually happens?

Solution

On Linux et al. fork() uses copy-on-write semantics, i.e. all of the memory pages of the forked process are shared (not copied) and only those pages that the process wants to modify are copied before the modifications are done. So it's possible to use very little memory even if you have a lot of forked processes, if your modified data is close together, i.e. it uses a small number of actual memory pages.

See: