node.js node-cluster

Node: create as many processes as the number of cores


From the Node.js docs:

These child Nodes are still whole new instances of V8. Assume at least 30ms startup and 10mb memory for each new Node. That is, you cannot create many thousands of them.

Conclusion: the best thing to do is to fork as many workers as you have CPU cores, like this:

var cluster = require('cluster');
var http = require('http');
var numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  // Fork workers.
  for (var i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  // Restart a worker whenever one exits ('exit' replaced the old 'death' event).
  cluster.on('exit', function(worker) {
    console.log('worker ' + worker.process.pid + ' died');
    cluster.fork();
  });
} else {
  // Worker processes have a http server.
  http.Server(function(req, res) {
    res.writeHead(200);
    res.end("hello world\n");
  }).listen(8000);
}

But if, let's say, we have 4 cores, we create 4 worker processes plus the master process, so in total we have 5 processes, which is again more than the number of CPU cores.

Is this efficient?


Solution

  • As a general answer, you should fork as many processes as the number of CPUs - 1, since you should leave one core to the OS to manage the other processes on your server (cron? logrotate? whatever).
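
    As a rough sketch of the "CPUs - 1" rule above (the helper name workerCount is hypothetical, not part of the cluster API), the fork loop from the question could be bounded like this, keeping at least one worker on a single-core machine:

    ```javascript
    const cluster = require('cluster');
    const http = require('http');
    const os = require('os');

    // Hypothetical helper: leave one core to the OS, but never drop below one worker.
    function workerCount(cpuCount) {
      return Math.max(1, cpuCount - 1);
    }

    // Same structure as the question's script, with the adjusted worker count.
    function start(port) {
      if (cluster.isMaster) { // deprecated alias of cluster.isPrimary in recent Node.js versions
        for (let i = 0; i < workerCount(os.cpus().length); i++) {
          cluster.fork();
        }
      } else {
        http.createServer(function (req, res) {
          res.writeHead(200);
          res.end('hello world\n');
        }).listen(port);
      }
    }

    console.log('workers to fork:', workerCount(os.cpus().length));
    // start(8000); // uncomment to actually fork the workers and serve requests
    ```

    Whether this beats one worker per core depends on the workload, so it is worth benchmarking both settings.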

    Moreover, the processes you fork may end up sharing the same CPU, since by default processor affinity is managed by the OS (it can be customized; there is an example of this and a good answer on Stack Overflow).

    So it depends mostly on what your application does, since forking may bring no benefit at all. CPU-intensive work (like crypto or other operations that block the event loop) is what benefits from this optimization.
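
    To make "blocking the event loop" concrete, here is a minimal sketch using Node's built-in crypto.pbkdf2Sync. The function name hashBlocking, the salt, and the iteration count are arbitrary choices for illustration. While the synchronous call runs, that process cannot serve any other request, which is why spreading such work over several workers can help:

    ```javascript
    const crypto = require('crypto');

    // Hypothetical CPU-bound task: a synchronous key derivation.
    // Salt and iteration count are illustrative values only.
    function hashBlocking(password) {
      return crypto
        .pbkdf2Sync(password, 'fixed-demo-salt', 100000, 32, 'sha256')
        .toString('hex');
    }

    // Measure how long the event loop is blocked by one call.
    const startedAt = process.hrtime.bigint();
    const digest = hashBlocking('secret');
    const elapsedMs = Number(process.hrtime.bigint() - startedAt) / 1e6;

    console.log('digest length: ' + digest.length);
    console.log('event loop blocked for about ' + elapsedMs.toFixed(1) + ' ms');
    ```

    The measured time depends on the machine, so no particular figure should be expected.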

    For example, using your script and a simple benchmark with

    npx autocannon -c 100 -d 5 -p 10 localhost:8000/:

    ┌───────────┬────────┬────────┬─────────┬─────────┬─────────┬──────────┬────────┐
    │ Stat      │ 1%     │ 2.5%   │ 50%     │ 97.5%   │ Avg     │ Stdev    │ Min    │
    ├───────────┼────────┼────────┼─────────┼─────────┼─────────┼──────────┼────────┤
    │ Req/Sec   │ 46943  │ 46943  │ 71039   │ 79039   │ 68444.8 │ 11810.39 │ 46930  │
    ├───────────┼────────┼────────┼─────────┼─────────┼─────────┼──────────┼────────┤
    │ Bytes/Sec │ 6.1 MB │ 6.1 MB │ 9.23 MB │ 10.3 MB │ 8.9 MB  │ 1.54 MB  │ 6.1 MB │
    └───────────┴────────┴────────┴─────────┴─────────┴─────────┴──────────┴────────┘
    

    Then the same autocannon command against the endpoint without forking:

    var http = require('http')
    
    // A single process runs the HTTP server (no cluster).
    http.Server(function (req, res) {
      res.writeHead(200)
      res.end('hello world\n')
    }).listen(8000)
    
    ┌───────────┬─────────┬─────────┬────────┬─────────┬─────────┬─────────┬─────────┐
    │ Stat      │ 1%      │ 2.5%    │ 50%    │ 97.5%   │ Avg     │ Stdev   │ Min     │
    ├───────────┼─────────┼─────────┼────────┼─────────┼─────────┼─────────┼─────────┤
    │ Req/Sec   │ 48063   │ 48063   │ 62303  │ 63167   │ 59632   │ 5807.42 │ 48040   │
    ├───────────┼─────────┼─────────┼────────┼─────────┼─────────┼─────────┼─────────┤
    │ Bytes/Sec │ 6.25 MB │ 6.25 MB │ 8.1 MB │ 8.21 MB │ 7.75 MB │ 755 kB  │ 6.25 MB │
    └───────────┴─────────┴─────────┴────────┴─────────┴─────────┴─────────┴─────────┘
    

    Here the version without forking is not much slower (about 59.6k vs 68.4k requests per second on average)!

    But if we change the endpoint to do a CPU-heavy operation:

    http.Server(function (req, res) {
      res.writeHead(200)
      for (var i = 0; i < 999999; i++) {
        // cpu cycle waste
      }
      res.end('hello world\n')
    }).listen(8000)
    

    Without forking we get:

    ┌───────────┬────────┬────────┬────────┬────────┬────────┬─────────┬────────┐
    │ Stat      │ 1%     │ 2.5%   │ 50%    │ 97.5%  │ Avg    │ Stdev   │ Min    │
    ├───────────┼────────┼────────┼────────┼────────┼────────┼─────────┼────────┤
    │ Req/Sec   │ 1671   │ 1671   │ 1790   │ 1870   │ 1784.2 │ 76.02   │ 1671   │
    ├───────────┼────────┼────────┼────────┼────────┼────────┼─────────┼────────┤
    │ Bytes/Sec │ 217 kB │ 217 kB │ 233 kB │ 243 kB │ 232 kB │ 9.88 kB │ 217 kB │
    └───────────┴────────┴────────┴────────┴────────┴────────┴─────────┴────────┘
    

    And with forking plus the CPU-heavy operation we get a great improvement (about 9.3k vs 1.8k requests per second on average)!

    ┌───────────┬─────────┬─────────┬─────────┬─────────┬─────────┬─────────┬─────────┐
    │ Stat      │ 1%      │ 2.5%    │ 50%     │ 97.5%   │ Avg     │ Stdev   │ Min     │
    ├───────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
    │ Req/Sec   │ 8575    │ 8575    │ 9423    │ 9823    │ 9324    │ 421.9   │ 8571    │
    ├───────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
    │ Bytes/Sec │ 1.12 MB │ 1.12 MB │ 1.22 MB │ 1.28 MB │ 1.21 MB │ 54.7 kB │ 1.11 MB │
    └───────────┴─────────┴─────────┴─────────┴─────────┴─────────┴─────────┴─────────┘
    

    So I think you should not "pre-optimize" an HTTP endpoint, since it could be a lot of work for little benefit.
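
    To put numbers on the comparison, this small sketch computes the speedup from the average Req/Sec values in the four tables above (the speedup helper and the object layout are just for illustration):

    ```javascript
    // Average Req/Sec values copied from the autocannon tables above.
    const results = {
      simple: { fork: 68444.8, noFork: 59632 },
      cpuHeavy: { fork: 9324, noFork: 1784.2 },
    };

    // Ratio of forked throughput to single-process throughput.
    function speedup(r) {
      return r.fork / r.noFork;
    }

    console.log('simple handler:    ' + speedup(results.simple).toFixed(2) + 'x');
    console.log('CPU-heavy handler: ' + speedup(results.cpuHeavy).toFixed(2) + 'x');
    ```

    With these figures the simple handler gains roughly 1.15x from forking, while the CPU-heavy handler gains roughly 5.23x.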