Tags: javascript, async-await, puppeteer, settimeout, es6-promise

How to catch an error from an async callback function in an outer try/catch block


Ok,

So I am using the Puppeteer framework and I have an async function that interacts with a webpage. This function clicks and selects elements of a webpage while it waits for the page's network traffic to become idle. This function works most of the time, but sometimes it stalls.

I want to be able to set a timeout so that if the function is taking longer than a certain amount of time, it throws an error and I can run it again. So far I cannot seem to get this to work because I cannot get the callback function I pass to setTimeout() to 'interact' with the outer function.

My code looks like this:

const scrap_webtite = async page => {
    /* scrape the site */
    try { // catch all
        // set timeout
        let timed_out_ID = setTimeout(() => { throw "timeOut"; }, 1000);
        // run the async function
        let el = await sometimes_stalls_function(page);
        // if the function finished correctly
        clearTimeout(timed_out_ID);
        // save el
        save_el(el);
    } catch (e) {
        console.error("Something went wrong!", e);
        // this makes the function run again
        // here is where I want to ideally catch the timeout error
        return false;
    }
};

I have also tried wrapping the setTimeout function in a Promise as per this post and then using the .then().catch() callbacks to try to catch the error, to no avail.

Apologies if this is a stupid question, thanks for your help.


Solution

  • The problem you're running into is that the error thrown inside the setTimeout() callback is not part of your function's control flow, so the surrounding try/catch can't catch it. The callback is invoked later by the event loop, on its own call stack, after the try block has already handed control back at the await. You can think of the timer's callback as a "detached" function: the variables from the parent scope are still available, but you can't return a value or throw an error to the parent directly.

    To work around this problem you have a few options; Promise.race() is one of them. The idea is to first make a promise-based version of a timeout:

    const rejectAfter = (timeout) => {
      return new Promise((resolve, reject) => {
        // Reject with an Error so the catch block receives a useful message
        setTimeout(() => reject(new Error("timeOut")), timeout);
      });
    };
    

    Then extract your business logic into a separate async function, along the lines of:

    const doTheThing = async () => {
      // TODO: Implement
    };
    

    And finally in your scraping function, use Promise.race() to use the result from whichever of the two finishes first:

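    const scrape = async (page) => {
      try {
        // Settles with whichever of the two promises finishes first
        const el = await Promise.race([
          rejectAfter(1000),
          doTheThing()
        ]);
        return el;
      } catch(error) {
        // TODO: Handle error (a timeout rejection from rejectAfter() also lands here)
      }
    };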
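
Putting the pieces together, the following is one possible sketch rather than a definitive implementation. The names withTimeout, fakeTask and scrapeOnce are hypothetical; fakeTask merely stands in for sometimes_stalls_function so the example runs without Puppeteer. Compared with the snippets above, it also clears the timer once the race settles, so no stray timer stays pending after a successful run.

```javascript
// Hypothetical helper: race a promise against a timer and always clear the timer.
const withTimeout = (promise, ms) => {
  let timerId;
  const timeout = new Promise((_, reject) => {
    timerId = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  // finally() runs whether the race resolves or rejects.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timerId));
};

// Hypothetical stand-in for sometimes_stalls_function: resolves after `delay` ms.
const fakeTask = (delay, value) =>
  new Promise((resolve) => setTimeout(() => resolve(value), delay));

// Same contract as in the question: the result on success, false on timeout or failure.
const scrapeOnce = async (task, ms) => {
  try {
    return await withTimeout(task(), ms);
  } catch (error) {
    console.error("Something went wrong!", error.message);
    return false;
  }
};

// Usage: the first call finishes in time, the second one hits the timeout.
scrapeOnce(() => fakeTask(10, "el"), 200).then((result) => console.log(result));
scrapeOnce(() => fakeTask(300, "el"), 20).then((result) => console.log(result));
```

Note that Promise.race() only stops waiting for the slower promise; it does not cancel it. A stalled page interaction keeps running in the background, so before retrying it may be worth closing or reloading the page.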