Tags: javascript, node.js, promise, bluebird, node-async

Iterate large number of async calls / results in node.js (using ES6 / async / bluebird / generators)?


I am writing a utility in node.js that has to process and concatenate a large number of files every night. In synchronous pseudocode it would look like this (omitting try / catch for clarity):

while (true) {
    var next = db.popNext();
    if (!next) return;

    out.append(next);
}

However, in the library I am using, popNext() is actually a node-style asynchronous method that looks like this: popNext(callback).

Since I am writing the middleware from scratch I could use --harmony (e.g., generators), async or bluebird.

Ideally I would prefer something like:

forEachOrdered(db.popNext, (error, next, ok, fail) => {
   if(error) return; // skip

   // If there was an internal error, terminate the whole loop.
   if(out.append(next)) ok();
   else fail();
}).then(() => {
   // All went fine.
}).catch(e => {
   // Fail was called.
});

However, I am open to other 'standard' solutions. I was wondering what would be the most concise solution to this problem?

Edit: Just spawning all the calls at the same time (in a regular for loop) would probably not solve my problem: we're talking about hundreds of thousands of items, and for every item I have to open and read a file, so I would probably run out of file descriptors.


Solution

  • Here is a solution using bluebird coroutines that mirrors your synchronous pseudocode:

    var Promise = require("bluebird"); // bluebird provides promisifyAll and coroutine
    var db = Promise.promisifyAll(db);
    
    var processAll = Promise.coroutine(function*(){
      while(true){
        var next = yield db.popNextAsync(); // promisify gives Async suffix
        if(!next) return;
        out.append(next); // some processing
      }       
    });
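    To make the mechanics less magical, here is a minimal sketch of roughly what promisification and a coroutine runner do, written with native promises only. It is not bluebird's implementation. makeFakeDb, addPopNextAsync and coroutine are hypothetical names introduced for illustration, and the in-memory queue stands in for the real db.

```javascript
// Hypothetical in-memory stand-in for the real db: node-style popNext(callback).
function makeFakeDb(items) {
  const queue = items.slice();
  return {
    popNext(callback) {
      // Call back asynchronously, as an I/O-backed method would.
      setImmediate(() => callback(null, queue.shift()));
    },
  };
}

// Roughly what promisification adds: a promise-returning twin of popNext.
function addPopNextAsync(db) {
  db.popNextAsync = () =>
    new Promise((resolve, reject) => {
      db.popNext((error, value) => (error ? reject(error) : resolve(value)));
    });
  return db;
}

// Minimal coroutine runner: resumes the generator whenever a yielded promise settles.
function coroutine(generatorFunction) {
  return function (...args) {
    const iterator = generatorFunction.apply(this, args);
    return new Promise((resolve, reject) => {
      function step(method, input) {
        let result;
        try {
          result = iterator[method](input); // "next" or "throw"
        } catch (error) {
          return reject(error); // the generator threw and did not catch it
        }
        if (result.done) return resolve(result.value);
        Promise.resolve(result.value).then(
          (value) => step("next", value),
          (error) => step("throw", error)
        );
      }
      step("next", undefined);
    });
  };
}

const db = addPopNextAsync(makeFakeDb(["a", "b", "c"]));
const out = [];

const processAll = coroutine(function* () {
  while (true) {
    const next = yield db.popNextAsync();
    if (!next) return out.length; // an empty queue ends the loop
    out.push(next);
  }
});

const done = processAll(); // promise for the number of processed items
```

    Only one popNext call is in flight at any time, so items are handled strictly in order.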
    

    With async functions (standardized in ES2017), the coroutine version becomes:

    var db = Promise.promisifyAll(db); // still need to promisify
    
    async function processAll(){
      let next;
      while(next = await db.popNextAsync()){
         // whatever
         out.append(next);
      }
    }
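    The error handling from the question (skip an item whose read failed, abort the whole loop when appending fails) can be layered on the same loop. The following is a sketch under assumptions: makeFakeDb and the out object are hypothetical stand-ins, and Node's built-in util.promisify replaces bluebird so the example runs without dependencies.

```javascript
const { promisify } = require("util");

// Hypothetical db: each queue entry is either a value or an Error to report.
function makeFakeDb(entries) {
  const queue = entries.slice();
  return {
    popNext(callback) {
      const entry = queue.shift();
      setImmediate(() => {
        if (entry instanceof Error) callback(entry);
        else callback(null, entry); // undefined once the queue is empty
      });
    },
  };
}

async function processAll(popNextAsync, out) {
  while (true) {
    let next;
    try {
      next = await popNextAsync();
    } catch (error) {
      // Skip a failed read. This assumes the failing item has been consumed;
      // if popNext can fail forever, add a retry limit here.
      continue;
    }
    if (!next) return; // queue drained
    if (!out.append(next)) throw new Error("append failed"); // aborts the loop
  }
}

const db = makeFakeDb(["a", new Error("unreadable"), "b"]);
const out = {
  items: [],
  append(item) {
    this.items.push(item);
    return true; // hypothetical success flag
  },
};

const finished = processAll(promisify(db.popNext.bind(db)), out)
  .then(() => "ok") // all went fine
  .catch((error) => "failed: " + error.message); // append reported a failure
```

    Because every item is awaited before the next one is requested, at most one file needs to be open at a time, which sidesteps the file descriptor concern.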
    

    That said, I'd argue the output collection should be an iterable (and lazy) too, so using async generators (standardized in ES2018):

    var db = Promise.promisifyAll(db);
    async function* processAll(){
        while(true){
           var val = await db.popNextAsync();
           if(!val) return;
           // process() stands for your own per-item processing function
           yield process(val); // yield the processed value forward
        }
    }
    

    And if we really want to go all out here, after converting db.popNext into an async iterator, this becomes, in ES2018 for await notation:

    async function* processAll(){
        for await(let next of db.asAsyncIterator()){ // need to write this like above
           yield process(next); // do some processing
        }
    }
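    The asAsyncIterator adapter does not exist in the library, so it has to be written by hand. A minimal sketch, assuming a falsy value from popNext means the queue is empty: asAsyncIterator is a free function here rather than a method on db, and makeFakeDb, collect and the upper-casing step are placeholders for illustration.

```javascript
const { promisify } = require("util");

// Hypothetical in-memory stand-in for the real db.
function makeFakeDb(items) {
  const queue = items.slice();
  return {
    popNext(callback) {
      setImmediate(() => callback(null, queue.shift()));
    },
  };
}

// Hypothetical adapter: exposes node-style popNext(callback) as an async iterable.
async function* asAsyncIterator(db) {
  const popNextAsync = promisify(db.popNext.bind(db));
  while (true) {
    const next = await popNextAsync();
    if (!next) return; // falsy value signals an empty queue
    yield next;
  }
}

// Lazy pipeline: nothing is popped until a consumer asks for the next value.
async function* processAll(db) {
  for await (const next of asAsyncIterator(db)) {
    yield next.toUpperCase(); // placeholder for real processing
  }
}

// Drain an async iterable into an array.
async function collect(iterable) {
  const values = [];
  for await (const value of iterable) values.push(value);
  return values;
}

const collected = collect(processAll(makeFakeDb(["a", "b"])));
```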
    

    The for await version leverages the whole ES2018 async iteration API. If you can't or don't want to use generators, you can always convert while loops to recursion:

    function processAll(){ // works on netscape 7
       return db.popNextAsync().then(function next(value){
          if(!value) return;
          out.push(process(value));
          return db.popNextAsync().then(next); // after bluebird promisify
       });
    }
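    The recursive form does not grow the call stack, because each then callback runs on a fresh stack after the previous promise settles. A small sketch that pushes 50,000 items through the same pattern, using a hypothetical in-memory db that already exposes popNextAsync:

```javascript
// Hypothetical db that counts down from `count`, then yields undefined.
function makeFakeDb(count) {
  let remaining = count;
  return {
    popNextAsync() {
      return new Promise((resolve) => {
        setImmediate(() => resolve(remaining > 0 ? remaining-- : undefined));
      });
    },
  };
}

function processAll(db, out) {
  return db.popNextAsync().then(function next(value) {
    if (!value) return out.length; // queue drained: resolve with the item count
    out.push(value);
    return db.popNextAsync().then(next); // asynchronous "recursion"
  });
}

const out = [];
const total = processAll(makeFakeDb(50000), out);
```

    The chain of pending promises still takes memory proportional to the number of items, so for very long runs the loop-based versions are the safer choice.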