Tags: javascript, node.js, fork, workflow

Forking tasks workflow in Javascript


I'm doing some tests to learn how to fork different tasks in JavaScript, as I'm new to the language. I'm trying to sum every group of three consecutive numbers from a plain text file formatted as follows:

199
200
208
210
200
207

(199, 200, 208) is the first group, (200, 208, 210) is the second one, etc...
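
For reference, the grouping itself can be sketched without any forking. This is only an illustration of the expected sums for the sample above; the helper name slidingSums is hypothetical and not part of the original code.

```javascript
// Hypothetical helper: sums every run of `size` consecutive numbers.
function slidingSums(values, size = 3) {
    const sums = [];
    for (let i = 0; i + size <= values.length; i++) {
        let sum = 0;
        for (let j = i; j < i + size; j++) {
            sum += values[j];
        }
        sums.push(sum);
    }
    return sums;
}

// The six sample readings, parsed the same way a file's content would be.
const readings = '199\n200\n208\n210\n200\n207\n'
    .split(/\r?\n/)
    .filter((line) => line.trim() !== '')
    .map(Number);

console.log(slidingSums(readings)); // [ 607, 618, 618, 617 ]
```

Six readings give four groups, which is why the loop in parent.js runs to readArray.length - 2.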

I read the file, split the string, and got my array of strings. Now I want to do the addition in a loop that forks a process on every iteration (the sum is computed in the subprocess) and then print the resulting array of sums.

parent.js

const fs = require('fs');
const { fork } = require('child_process');

const readString = fs.readFileSync('depth_readings_p2.txt', 'utf8');
const readArray = readString.split('\n');

var numArrayDef = [];

for (let i = 0; i < readArray.length - 2; i++) {
    let msg = {
        i,
        readArray
    };

    let childProcess = fork('function.js');

    childProcess.send(msg);

    childProcess.on('message', (m) => {
        console.log(m);
        numArrayDef.push(m);
    });

    console.log(numArrayDef[i]);
}

As you can see, I'm sending the subprocess an object that includes the index and the array of strings. The parent process receives the summed number and stores it in numArrayDef.

function.js

process.on('message', (msg) => {
    let num = 0;

    if ((msg.i + 2) < msg.readArray.length) {
        num += parseInt(msg.readArray[msg.i]);
        num += parseInt(msg.readArray[msg.i + 1]);
        num += parseInt(msg.readArray[msg.i + 2]);

        process.send(num);
    }

    process.exit();
});

In the output I can see that the parent is receiving everything correctly, but the program isn't pushing the received values into the result array. Also, the order of execution is weird: first, everything in the loop except the message-receiving block runs; second, everything after the loop; finally, the message-receiving block.

undefined
undefined
undefined
undefined
undefined
undefined
undefined
undefined
[]
607
618
618
617
647
716
769
792

I know I'm missing something about forking processes, but I don't know what it is, and I can't find it in the fork documentation.


Solution

  • What you have to understand about Node.js is its asynchronous nature: the code is often not executed in the order in which you wrote it.

    The childProcess is a process handle that is returned immediately, but the forked process itself may take some time to start. What you do is register a callback that will be executed every time a message event is received. Check this code:

    parent.js

    const { fork } = require('child_process');
    
    let childProcess = fork('function.js');
    
    // This line is executed immediately after the handle is created.
    // You pass a newly created function to ".on()"; it will be called
    // every time the child process sends a "message" event.
    // Note that you only declare an anonymous function and pass it as an
    // argument, so it is the code receiving it (the event emitter) that
    // decides when to call it.
    childProcess.on('message', (m) => {
        console.log('message received in parent:', m)
        console.log('closing the process')
        childProcess.kill('SIGINT')
    });
    
    
    childProcess.on('exit', () => {
        console.log('child is done!')
    })
    
    childProcess.send('I will come back!')
    
    console.log('last line reached. Program still running.')
    
    

    function.js

    process.on('message', (msg) => {
    
        // Wait 2000 ms, then send the message back to the parent.
        setTimeout(() => {
            process.send(msg)
        }, 2000)
    })
    

    output

    last line reached. Program still running.
    message received in parent: I will come back!
    closing the process
    child is done!
    

    execution order

    • Fork a process and get its handle. Code execution continues immediately.
    • Register callback listeners that will be called on given events such as message or exit. These are asynchronous: you don't know when they will fire.
    • Log that all lines have been executed.
    • Some time later, the message listener fires, followed by the exit listener.
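
The same ordering can be reproduced in a single file without a real child process. The sketch below is an illustration only: fakeChild is a plain EventEmitter standing in for the process handle (a real ChildProcess is also an EventEmitter), and the 20 ms timeout is an arbitrary stand-in for the child's reply.

```javascript
const { EventEmitter } = require('events');

const log = [];

// Stand-in for the handle returned by fork(); assumed here for illustration.
const fakeChild = new EventEmitter();
log.push('1: handle created');

// Registering listeners does not run them; it only stores the functions.
fakeChild.on('message', (m) => log.push(`4: message received: ${m}`));
fakeChild.on('exit', () => log.push('5: exit received'));
log.push('2: listeners registered');

// Simulate the child answering later, after the current code has finished.
setTimeout(() => {
    fakeChild.emit('message', 'I will come back!');
    fakeChild.emit('exit');
    console.log(log.join('\n'));
}, 20);

log.push('3: last line reached');
```

The entries are printed in the order 1 to 5, even though the listeners for steps 4 and 5 appear in the source before the line for step 3.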

    your code

    Your code basically runs to the end (only attaching handlers to each process handle) and logs entries of numArrayDef that have not been added yet. If no element is present at numArrayDef[5], for example, it logs undefined.

    callbacks

    Since Node.js runs your JavaScript on a single thread by default, it's common to call an asynchronous function and pass it a callback (just another function) that is executed when the called function is done.
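
A minimal sketch of that pattern, with a result handed to the callback. The helper sumLater is hypothetical, and the 10 ms delay only simulates work that finishes later, such as a child process replying.

```javascript
// Hypothetical helper: adds three numbers "later" and passes the sum to a callback.
function sumLater(a, b, c, callback) {
    setTimeout(() => callback(a + b + c), 10);
}

const results = [];

sumLater(199, 200, 208, (sum) => {
    // Runs only once the timer fires, after the rest of the file has executed.
    results.push(sum);
    console.log('inside callback:', results);
});

// Runs first: the callback has not been called yet, so results is still empty.
console.log('right after the call:', results);
```

Anything that depends on the result has to live inside the callback (or be triggered from it), which is the same reason numArrayDef[i] is undefined when it is logged directly inside the loop.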

    The fixed code

    parent.js

    const fs = require('fs');
    const { fork } = require('child_process');
    const { EOL } = require('os')
    
    const readString = fs.readFileSync('file.txt', 'utf8');
    const readArray = readString.split(EOL);
    
    var numArrayDef = [];
    
    for (let i = 0; i < readArray.length - 2; i++) {
    
        // msg building. Done instantly.
        let msg = {
            i,
            readArray
        };
    
    
        // Forking a child process. The handle is returned immediately,
        // but starting the process may take some time, and
        // the code won't wait for it.
        let childProcess = fork('function.js');
    
        // This line is executed immediately after the handle is created.
        // It registers a listener that runs whenever the child sends a message.
        childProcess.on('message', (m) => {
            console.log('message received', m)
            numArrayDef.push(m);
    
            // log if all numbers are done.
            if(numArrayDef.length === readArray.length -2) {
                console.log('Done. Here\'s the array:', numArrayDef)
            }
    
        });
    
        childProcess.send(msg);
    }
    

    function.js

    process.on('message', (msg) => {
        let num = 0;
    
        if ((msg.i + 2) < msg.readArray.length) {
            num += parseInt(msg.readArray[msg.i]);
            num += parseInt(msg.readArray[msg.i + 1]);
            num += parseInt(msg.readArray[msg.i + 2]);
    
            process.send(num);
        }
    
        process.exit();
    });
    

    This should give you an idea. I recommend going through some tutorials at the beginning to understand the nature of the language.
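
One caveat with the code above: numArrayDef.push(m) stores results in the order the children reply, which is not guaranteed to match the order of the groups. A possible way around this, sketched under assumptions, is to wrap each child in a Promise and collect them with Promise.all, which keeps the results in the order the promises were created. The names runWorkerWithFork and sumWindowsInOrder are hypothetical, and function.js is assumed to be the worker shown above, which always replies once for the indices used here.

```javascript
const { fork } = require('child_process');

// Wraps one forked child in a Promise. Assumes 'function.js' is the worker
// shown above, which replies with a single number and then exits.
// If a child never replied, this promise would stay pending.
function runWorkerWithFork(i, readArray) {
    return new Promise((resolve, reject) => {
        const child = fork('function.js');
        child.once('message', (sum) => resolve(sum));
        child.once('error', reject);
        child.send({ i, readArray });
    });
}

// Starts one worker per group and resolves with the sums in group order,
// regardless of which worker finishes first.
function sumWindowsInOrder(readArray, runWorker = runWorkerWithFork) {
    const pending = [];
    for (let i = 0; i < readArray.length - 2; i++) {
        pending.push(runWorker(i, readArray));
    }
    return Promise.all(pending);
}
```

Usage would look like sumWindowsInOrder(readArray).then((sums) => console.log(sums)). The runWorker parameter exists only so that the ordering logic can be tried with a stand-in function instead of real child processes.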

    What you should learn about nodejs

    • Learn what a callback is.
    • A basic understanding of Promises and async/await is a must.
    • Learn which operations are synchronous and which are asynchronous.
    • The EventEmitter class is also used very often.
    • Learning how to handle child_process, fork, and similar features is not really required to get a basic understanding of Node.js.
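
To make the sync/async distinction from the list above concrete, here is a small sketch that reads the same file both ways. The sample file name and its content are made up for the example and written to the temp directory so the snippet is self-contained; readNumbers is a hypothetical helper.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical sample file, created only for this sketch.
const samplePath = path.join(os.tmpdir(), 'depth_readings_sample.txt');
fs.writeFileSync(samplePath, '199\n200\n208\n');

// Synchronous: blocks until the whole file is read, then returns the content.
const syncContent = fs.readFileSync(samplePath, 'utf8');

// Asynchronous: the call returns a Promise immediately;
// `await` pauses only this function, not the whole program.
async function readNumbers(filePath) {
    const content = await fs.promises.readFile(filePath, 'utf8');
    return content
        .split(/\r?\n/)
        .filter((line) => line.trim() !== '')
        .map(Number);
}

readNumbers(samplePath).then((numbers) => console.log('async result:', numbers));
console.log('sync content length:', syncContent.length);
```

The last line prints before the async result, for the same reason the loop in the question finishes before any message handler runs.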

    Function declaration

    Just an addendum on the syntax. These three variants are almost exactly the same, except that an arrow function has no this of its own: it uses the this of the surrounding scope where it is defined:

    // Use one variant at a time: declaring all three in the same scope
    // throws a SyntaxError because abc is already declared.
    
    // variant1
    function abc(fn) {
        // Execute the argument, which is a function, but only after
        // a timeout.
        setTimeout(fn, 2000)
    } 
    
    // variant2
    const abc = function(fn) {
        // Execute the argument, which is a function, but only after
        // a timeout.
        setTimeout(fn, 2000)
    }
    
    // variant3
    const abc = (fn) => {
        // Execute the argument, which is a function, but only after
        // a timeout.
        setTimeout(fn, 2000)
    }
    
    // call it like so:
    abc(function() { 
       console.log('I was passed!!.')
    })
    
    console.log('The abc function was called. Let\'s wait for it to call the passed function!')
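
The difference in this between the function styles only shows up when the function is called by someone else, for example by a timer. A minimal sketch, where the counter object and its method names are made up for illustration:

```javascript
const counter = {
    count: 0,

    // Arrow callback: `this` is inherited from incrementWithArrow,
    // so it still refers to `counter` when the timer fires.
    incrementWithArrow(done) {
        setTimeout(() => {
            this.count += 1;
            done(this.count);
        }, 0);
    },

    // Regular function callback: `this` is set by whoever calls it
    // (the timer), so it is not `counter` here.
    checkWithFunction(done) {
        setTimeout(function () {
            done(this === counter);
        }, 0);
    }
};

counter.incrementWithArrow((count) => console.log('count via arrow:', count));
counter.checkWithFunction((same) => console.log('this is counter?', same));
```

In this sketch the arrow version reports a count of 1, while the regular function reports false for the comparison, because its this is no longer the counter object.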