javascript, node.js, multithreading, asynchronous, lighthouse

interject synchronous code into asynchronous


I'm running the Lighthouse CLI against a list of ~50 sites. I'm simply running it in a .forEach loop, which, if I understand correctly, is blocking, i.e. synchronous. However, I end up spinning up 50 Chrome Canary instances all at once. In my limited understanding, each run is started synchronously, but then Node can hand the work off to the kernel and happily start the next one. Again, that's just my hand-wavy understanding of what's going on.
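
Roughly, the loop looks like this (simplified; `sites` is my array of site objects and `opts` the Lighthouse options):

// Simplified version of my loop: every iteration kicks off a
// Chrome + Lighthouse run immediately, so ~50 instances start at once
sites.forEach((site) => {
  launchChromeAndRunLighthouse(site.url, opts);
});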

I'm using this function that I cribbed from somewhere:

const chromeLauncher = require('chrome-launcher');
const lighthouse = require('lighthouse');

function launchChromeAndRunLighthouse(url, opts, config = null) {
  // Launch a Chrome instance, then point Lighthouse at its debugging port
  return chromeLauncher.launch({chromeFlags: opts.chromeFlags}).then(chrome => {
    opts.port = chrome.port;
    // Run the audit, then kill Chrome and hand back the results
    return lighthouse(url, opts, config).then(results =>
      chrome.kill().then(() => results));
  });
}

I tried nextTick in the loop:

asyncFuncs().then( async (sites) => {
  sites.forEach( (site) => {
    process.nextTick(launchChromeAndRunLighthouse(site.url, opts))
  })
})

But this still spawns a bunch of Chrome instances at once. How do I pause execution until one Lighthouse run completes before starting the next?


Solution

  • Since launchChromeAndRunLighthouse() returns a promise that signals when it's done, if you want to run them serially, one at a time, you can switch to a for...of loop and use await:

    asyncFuncs().then( async (sites) => {
      for (let site of sites) {
        await launchChromeAndRunLighthouse(site.url, opts);
      }
    });
    

    If you're trying to collect all the results:

    asyncFuncs().then( async (sites) => {
        let results = [];
        for (let site of sites) {
          let r = await launchChromeAndRunLighthouse(site.url, opts);
          results.push(r);
        }
        return results;
    }).then(results => {
        // all results here
    }).catch(err => {
        // process error here
    });
    

    If you want to run N Chrome instances at a time, such that N instances start initially and then, each time one finishes, the next waiting one is launched, that's more complicated because you have to track how many instances are in flight. There are helper functions called pMap() or mapConcurrent() that can do that for you in these answers (a minimal sketch of the idea follows after the links):

    Make several requests to an API that can only handle 20 request a minute

    Promise.all consumes all my RAM
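
    For illustration, here's a minimal sketch of that kind of helper (a simplified take on the mapConcurrent() idea; the answers linked above have more complete implementations):

    function mapConcurrent(items, limit, fn) {
      return new Promise((resolve, reject) => {
        const results = new Array(items.length);
        let index = 0;    // next item to start
        let active = 0;   // runs currently in flight

        function runNext() {
          if (index >= items.length) {
            // nothing left to start; resolve once in-flight runs finish
            if (active === 0) resolve(results);
            return;
          }
          const i = index++;
          active++;
          Promise.resolve(fn(items[i])).then(result => {
            results[i] = result;
            active--;
            runNext();
          }, reject);
        }

        if (items.length === 0) return resolve(results);
        // prime the pump: each finished run launches the next waiting one
        for (let i = 0; i < Math.min(limit, items.length); i++) {
          runNext();
        }
      });
    }

    // e.g. at most 3 Chrome instances at a time
    // mapConcurrent(sites, 3, site => launchChromeAndRunLighthouse(site.url, opts))
    //   .then(results => { /* all results here */ });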


    The Bluebird Promise library also has concurrency control in its Promise.map() function.
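
    For example, it might look something like this (assuming bluebird is installed; the concurrency value of 3 is arbitrary):

    const Promise = require('bluebird');

    // run Lighthouse for all sites with at most 3 Chrome instances in flight
    Promise.map(sites, site => launchChromeAndRunLighthouse(site.url, opts), {concurrency: 3})
      .then(results => {
        // all results here
      })
      .catch(err => {
        // process error here
      });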