javascript, node.js, stream, spawn

Best way to get back info from spawned child process


I have a strange problem.

I have to spawn a process in my Node app (it is external, so fork is not possible). This child process then sends output that I need to get back and store in a db. The way I do it now is to echo each line of my data (it's JSON) and listen for what comes in on stdout.

Child code:

var cntSent = 0;
for (var j = 0, lUF = uniqueFlyers.length; j < lUF; j++) {
  var products = uniqueFlyers[j].products;
  for (var k = 0, lP = products.length; k < lP; k++) {
    var pstr = products[k].product;
    this.echo(pstr);
    cntSent+=1;
  }
}
console.log(cntSent);

At the end, cntSent = 10000.

Node side:

var cntReceived = 0;
proc.stdout.on('data', function(line) {
  cntReceived+=1;
  console.log(line);
});
proc.on('close', function (code) {
  console.log(cntReceived);
});

At the end, cntReceived = 3510.

I can see all my data in the output, but it is aggregated and arrives in big chunks. My idea is to write to a file and then process the file with Node, but that seems redundant, and I'd like to start processing the data as it comes in. Any suggestions as to the most accurate and fastest way?

EDIT: As usual, writing down the question made me think. Am I just being silly, and would I be better off buffering the data and then parsing it? It's JSON, goddamnit!


Solution

  • There is no need to write the data to a file and then process the file, nor do you need to buffer all of the data before processing it.

    If the data you're outputting is JSON, I'd suggest using JSONStream in the parent code. This will allow you to parse the output on the fly. Below is an example.

    The child code will output a JSON array:

    // Child code
    var cntSent = 0; // number of elements written so far
    console.log('['); // We'll output a JSON array
    for (var j = 0, lUF = uniqueFlyers.length; j < lUF; j++) {
      var products = uniqueFlyers[j].products;
      for (var k = 0, lP = products.length; k < lP; k++) {
        var pstr = products[k].product;
        if (cntSent > 0)
            console.log(','); // output a comma before every element except the first
        console.log(JSON.stringify(pstr)); // output some JSON
        cntSent += 1;
      }
    }
    console.log(']'); // close the array
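
    Separator placement is easy to get wrong when two loops are nested, so it can help to check it in isolation. The following is only a sketch under stated assumptions: sampleFlyers is made-up data standing in for the real uniqueFlyers (which the question does not show), and emitJsonArray is a hypothetical helper that collects the lines instead of printing them, so the result can be fed to JSON.parse.

```javascript
// Hypothetical sample data standing in for the real uniqueFlyers.
var sampleFlyers = [
  { products: [{ product: { name: 'a' } }, { product: { name: 'b' } }] },
  { products: [] }, // a flyer without products must not break the output
  { products: [{ product: { name: 'c' } }] }
];

// Hypothetical helper: returns the lines the child would print.
function emitJsonArray(flyers) {
  var lines = ['['];
  var count = 0;
  flyers.forEach(function (flyer) {
    flyer.products.forEach(function (p) {
      if (count > 0) lines.push(','); // separator before every element except the first
      lines.push(JSON.stringify(p.product));
      count += 1;
    });
  });
  lines.push(']');
  return lines;
}

// JSON.parse throws if a comma is missing or misplaced.
var parsed = JSON.parse(emitJsonArray(sampleFlyers).join('\n'));
console.log(parsed.length);
```

    Writing the comma before each element (rather than after it) avoids having to know which element is the last one across both loops.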
    

    The parent code reads this JSON array and processes it. We use the * selector to select all elements of the array. JSONStream then emits each JSON document one by one, as soon as it is parsed. Once we have this data, we can pipe it into a Writable stream that receives the JSON objects and does something (anything!) with them.

    // Parent code
    var stream = require('stream');
    var jsonstream = require('JSONStream').parse('*');
    var finalstream = new stream.Writable({ objectMode: true }); // this stream receives objects, not raw buffers or strings
    finalstream._write = function (doc, encoding, done) {
        console.log(doc);
        done();
    };
    
    // proc is the spawned child process from the question
    proc.stdout.pipe(jsonstream).pipe(finalstream);
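
    If adding a dependency is not an option, a similar effect can probably be had by letting the child print one JSON document per line and reassembling the lines in the parent. The sketch below is an assumption, not part of the setup above: makeLineParser is a hypothetical helper that keeps the unfinished tail of each chunk until the rest arrives. It also illustrates why the number of 'data' events in the question does not match the number of lines sent: chunk boundaries on a pipe have nothing to do with line boundaries.

```javascript
// Hypothetical helper: turns arbitrary stdout chunks into parsed JSON documents,
// assuming the child writes exactly one JSON document per line.
function makeLineParser(onDoc) {
  var pending = ''; // tail of the previous chunk that did not end with a newline
  return {
    push: function (chunk) {
      var parts = (pending + chunk.toString()).split('\n');
      pending = parts.pop(); // the last part is incomplete (or empty)
      parts.forEach(function (line) {
        if (line.trim() !== '') onDoc(JSON.parse(line));
      });
    },
    end: function () {
      // Flush whatever is left once the child has finished writing.
      if (pending.trim() !== '') onDoc(JSON.parse(pending));
      pending = '';
    }
  };
}

// Simulated chunks whose boundaries fall in the middle of a line.
var docs = [];
var parser = makeLineParser(function (doc) { docs.push(doc); });
['{"id":1}\n{"i', 'd":2}\n{"id"', ':3}'].forEach(function (c) { parser.push(c); });
parser.end();
console.log(docs.length);
```

    With a real child process this would be wired up roughly as proc.stdout.setEncoding('utf8'), then proc.stdout.on('data', parser.push) and proc.on('close', parser.end). Setting the encoding matters because a raw Buffer chunk may end in the middle of a multi-byte character.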