javascript node.js async-await es6-promise

Parallel HTTP requests in batches with async for loop for each request


I am trying to run parallel requests in batches to an API using a bunch of keywords in an array. Article by Denis Fatkhudinov.

The problem I am having is that for each keyword, I need to run the request again with a different page argument, once for every page up to the number in the pages variable.

I keep getting "Cannot read property 'then' of undefined" for the return value of the chainNext function.

The parallel requests in batches work great on their own, without the for loop, but I am struggling to incorporate the for loop into the process.

// Parallel requests in batches
async function runBatches() {
  // The keywords to request with 
  const keywords = ['many keyword strings here...'];
  // Set max concurrent requests
  const concurrent = 5;
  // Clone keywords array
  const keywordsClone = keywords.slice()
  // Array for future resolved promises for each batch
  const promises = new Array(concurrent).fill(Promise.resolve());
  // Async for loop
  const asyncForEach = async (pages, callback) => {
    for (let page = 1; page <= pages; page++) {
      await callback(page);
    }
  };
  // Number of pages to loop for
  const pages = 2;

  // Recursively run batches
  const chainNext = (pro) => {
    // Runs itself as long as there are entries left on the array
    if (keywordsClone.length) {
      // Store the first entry and conveniently also remove it from the array
      const keyword = keywordsClone.shift();
      // Run 'the promise to be' request
      return pro.then(async () => {
        // ---> Here was my problem, I am declaring the constant before running the for loop
        const promiseOperation = await asyncForEach(pages, async (page) => {
          await request(keyword, page)
        });
        // ---> The recursive invocation should also be inside the for loop
        return chainNext(promiseOperation);
      });
    }

    return pro;
  }

  return await Promise.all(promises.map(chainNext));
}

// HTTP request
async function request(keyword, page) { 
  try {
    // request API 
    const res = await apiservice(keyword, page);
    // Send data to an outer async function to process the data
    await append(res.data);
  } catch (error) {
    throw error; // Rethrow as-is; new Error(error) would discard the original stack trace
  }
}


runBatches()
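
The error can be reproduced in isolation. asyncForEach has no return statement, so awaiting it yields undefined; that value is then passed to chainNext, which calls .then on it. The following is a minimal sketch, where loopPages and chainOnce are hypothetical stand-ins for asyncForEach and chainNext, not code from the question:

```javascript
// Stand-in for asyncForEach: an async function without a return statement
// always resolves to undefined.
const loopPages = async (pages, callback) => {
  for (let page = 1; page <= pages; page++) {
    await callback(page);
  }
};

// Stand-in for chainNext: expects a promise and calls .then on it.
const chainOnce = (pro) => pro.then(() => 'done');

async function reproduce() {
  // result is undefined, because loopPages returns nothing.
  const result = await loopPages(2, async () => {});
  try {
    // Calling .then on undefined throws a TypeError.
    await chainOnce(result);
    return { result, error: null };
  } catch (error) {
    return { result, error };
  }
}

reproduce().then(({ result, error }) => {
  console.log(result);                     // undefined
  console.log(error instanceof TypeError); // true
});
```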

Solution

  • I got it working by moving the actual request (promiseOperation) inside the for loop and returning the recursive call from there too.

    // Recursively run batches
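    const chainNext = async (pro) => {
      // Keep going while there are keywords left in the queue
      if (keywordsClone.length) {
        // Take the next keyword off the front of the queue
        const keyword = keywordsClone.shift()
        return pro.then(async () => {
          await asyncForEach(pages, (page) => {
            // Start the request for this page; request() returns a promise
            const promiseOperation = request(keyword, page)
            // Chain the next keyword onto that promise, so .then is always called on a promise
            return chainNext(promiseOperation)
          })
        })
      }
      // No keywords left: hand back the promise so the caller can still await it
      return pro
    }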
    

    Credit for the parallel requests in batches goes to https://itnext.io/node-js-handling-asynchronous-operations-in-parallel-69679dfae3fc
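
The same behaviour can also be written without recursion. This is a minimal sketch, not the approach from the linked article. It assumes the goal is to have at most `concurrent` keywords in flight at once, with the pages of each keyword requested in order. runInBatches and fakeRequest are hypothetical names; fakeRequest stands in for the real request(keyword, page).

```javascript
// Hypothetical helper: processes keywords with at most `concurrent` workers,
// requesting pages 1..pages in order for each keyword.
async function runInBatches(keywords, pages, concurrent, requestFn) {
  const queue = keywords.slice(); // clone so the caller's array is untouched

  const worker = async () => {
    // shift() is synchronous, so two workers never take the same keyword.
    while (queue.length) {
      const keyword = queue.shift();
      for (let page = 1; page <= pages; page++) {
        await requestFn(keyword, page);
      }
    }
  };

  // Start the workers and wait until every one has drained the queue.
  const workerCount = Math.min(concurrent, keywords.length);
  await Promise.all(Array.from({ length: workerCount }, worker));
}

// Stand-in for the real request(keyword, page); it only records each call.
const calls = [];
const fakeRequest = async (keyword, page) => {
  await new Promise((resolve) => setTimeout(resolve, 10));
  calls.push(`${keyword}:${page}`);
};

const done = runInBatches(['a', 'b', 'c'], 2, 2, fakeRequest)
  .then(() => console.log(calls.length)); // 6
```

Each worker is a plain loop, so there is no promise to thread through a recursive call and nothing that can end up undefined.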