Puppeteer Not Triggering Click Before Returning HTML

My Node.js Puppeteer script fills out a form successfully, but a "click" on one element only registers some of the time before the script returns the modified page content. Here's the script:

const fetchContracts = async (url) => {
    const browser = await pupeteer.launch({ headless: true, args: ['--no-sandbox', '--disable-setuid-sandbox']});
    const page = await browser.newPage();
    const pendingXHR = new PendingXHR(page);

    await page.goto(url, { waitUntil: 'networkidle2' });
    await Promise.all([page.click("#agree_statement"), page.waitForNavigation()]);

    await Promise.all([page.click(".btn-primary"), page.waitForNavigation()]);

    /// Sometimes these clicks do not register....
    await page.click('#filedReports th:nth-child(5)');
    await pendingXHR.waitForAllXhrFinished();
    await page.click('#filedReports th:nth-child(5)');
    await pendingXHR.waitForAllXhrFinished();

    /// And my bot skips directly here....
    let html = await page.content();
    await page.close();
    await browser.close();
    return html;
};

PendingXHR is imported at the top of my code from this library:

const { PendingXHR } = require('pending-xhr-puppeteer');

The script works on my local computer, but only works some of the time when I upload it to Digital Ocean. On the page that I am crawling, these clicks initiate XHR requests (I have verified this), which I am attempting to wait for.

So my question is:

Why would these clicks not register, even though I am awaiting them and awaiting the XHR requests, before the html is pulled from the page and then returned? And why the inconsistency with this, where sometimes the clicks are registered and sometimes they are not?

Thanks for your help.


  • Short answer: The click will lead to a delayed AJAX request and therefore pendingXHR.waitForAllXhrFinished() will immediately resolve as there are no requests happening at the time the function is executed. Use page.waitForResponse('.../data/') instead.


    You are expecting the following sequence of events:

    1. Click happens
    2. AJAX request starts
    3. pendingXHR.waitForAllXhrFinished() executed
    4. AJAX request finishes
    5. Table is rendered
    6. pendingXHR.waitForAllXhrFinished() resolves
    7. page.content() executed

    The problem is that the library you are using (PendingXHR) waits only for the requests that are pending at the moment it is called and resolves as soon as they have finished. This does not work in two cases that I can think of:

    1. The AJAX request is started asynchronously

    In this case, the order of the events would be like this:

    1. Click happens, but starts the AJAX call asynchronously (later)
    2. pendingXHR.waitForAllXhrFinished() executed
    3. pendingXHR.waitForAllXhrFinished() resolves immediately (as there are no requests)
    4. page.content() executed (too early!)
    5. AJAX request starts
    6. AJAX request finishes
    7. Table is rendered
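
    The ordering in this first case can be reproduced without a browser. The sketch below is only an illustration: RequestTracker is a hypothetical stand-in that mimics the behaviour described above (it resolves once nothing is pending), not the actual pending-xhr-puppeteer implementation, and the 20 ms delays are arbitrary assumptions standing in for the page's deferred AJAX call.

```javascript
// Hypothetical stand-in for a "wait for pending requests" helper.
// It only knows about requests that have ALREADY started.
class RequestTracker {
  constructor() {
    this.pending = new Set();
  }

  // Register a request and forget it again once it settles.
  track(promise) {
    this.pending.add(promise);
    const forget = () => this.pending.delete(promise);
    promise.then(forget, forget);
    return promise;
  }

  // Resolves once every request that is pending at call time has settled.
  async waitForAllFinished() {
    await Promise.allSettled([...this.pending]);
  }
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function simulateDelayedRequest() {
  const log = [];
  const tracker = new RequestTracker();

  // "Click": the page schedules its AJAX call for later (20 ms is arbitrary).
  log.push('click');
  const requestDone = sleep(20).then(() => {
    log.push('request starts');
    return tracker.track(sleep(20).then(() => log.push('request finishes')));
  });

  // Nothing is pending yet, so this resolves right away.
  await tracker.waitForAllFinished();
  log.push('page.content() (too early)');

  await requestDone; // let the simulated request run to completion
  return log;
}

simulateDelayedRequest().then((log) => console.log(log.join('\n')));
```

    In this simulation the "page.content()" entry is logged before the request has even started, which matches steps 2 to 5 of the list above.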

    2. The UI modifies the table asynchronously

    In this case, the order of the events would be like this:

    1. Click happens
    2. AJAX request starts
    3. pendingXHR.waitForAllXhrFinished() executed
    4. AJAX request finishes (but the code renders the table later)
    5. pendingXHR.waitForAllXhrFinished() resolves
    6. page.content() (too early!)
    7. Table is rendered
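
    The second case can be sketched the same way. Everything here is assumed for illustration: the page object is a plain stand-in whose html property plays the role of what page.content() would return, and the timings are arbitrary.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function simulateDeferredRender() {
  // Minimal stand-in for the page; "html" is what page.content() would return.
  const page = { html: '<table>old rows</table>' };

  // "Click": the request starts immediately and takes 20 ms (arbitrary).
  const request = sleep(20).then(() => {
    // The response has arrived, but the UI code re-renders in a later task.
    setTimeout(() => {
      page.html = '<table>sorted rows</table>';
    }, 0);
  });

  await request; // every request has finished at this point
  const snapshot = page.html; // taken before the deferred render has run
  await sleep(10); // a short pause gives the render a chance to happen
  return { snapshot, afterPause: page.html };
}

simulateDeferredRender().then((result) => console.log(result));
```

    The snapshot taken immediately after the request finishes still holds the old rows, while the value read after a short pause holds the re-rendered table.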

    The inconsistency arises because this is a race condition: a few milliseconds can decide what happens first, so sometimes the events happen to occur in the right order.


    Without looking at the code of the page, I cannot say for sure which case it is (it might actually be both), but I would guess it is the first one, as I can easily imagine the table library waiting for any double clicks, dragging, etc. before it makes the AJAX request.

    The first problem can be fixed by using page.waitForResponse instead of pendingXHR.waitForAllXhrFinished as this makes sure that the request to data/ has actually happened.
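
    One way to package this idea is a small helper that starts waiting for the response before it clicks. This is a sketch under assumptions: clickAndWaitForResponse and its urlPart parameter are hypothetical names, and the predicate (URL contains urlPart and the status is successful) is just one reasonable way to recognise the request.

```javascript
// Click a selector and wait for the matching response that the click triggers.
// `page` is a Puppeteer Page; `urlPart` is a placeholder for part of the real URL.
async function clickAndWaitForResponse(page, selector, urlPart) {
  const [response] = await Promise.all([
    // Start listening BEFORE clicking so that a fast response cannot be missed.
    page.waitForResponse((res) => res.url().includes(urlPart) && res.ok()),
    page.click(selector),
  ]);
  return response;
}
```

    It could be called as await clickAndWaitForResponse(page, '#filedReports th:nth-child(5)', 'data/'), with the last argument replaced by whatever the real request URL contains.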

    Fixing the second case (if necessary) is not that trivial, but can be done by introducing a fixed waiting time by using page.waitFor(10).

    By fixing both cases, the new code looks like this:

    await Promise.all([ // wait for the response to happen and click
        page.waitForResponse('.../data/'), // use the actual URL here
        page.click('...'),
    ]);
    await page.waitFor(10); // wait for any asynchronous rerenders that might happen
    let html = await page.content();
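
    If a fixed delay turns out to be fragile (for example on a slower server), an alternative is to wait for the DOM itself to change. The following is a sketch, assuming the click actually changes the text of the table; clickAndWaitForTableChange is a hypothetical helper, and the selector passed as tableSelector has to be chosen by inspecting the real page.

```javascript
// Click, then wait until the table's text differs from what it was before.
// `page` is a Puppeteer Page; `tableSelector` is an assumed selector for the table.
async function clickAndWaitForTableChange(page, clickSelector, tableSelector) {
  // Remember the current table text so that a change can be detected.
  const before = await page.$eval(tableSelector, (el) => el.textContent);

  await page.click(clickSelector);

  // Poll inside the browser until the table text is no longer the old text.
  await page.waitForFunction(
    (selector, oldText) => {
      const el = document.querySelector(selector);
      return el !== null && el.textContent !== oldText;
    },
    { timeout: 10000 }, // give up after 10 seconds (arbitrary choice)
    tableSelector,
    before
  );

  return page.content();
}
```

    Note that this approach times out if the click leaves the table text unchanged, for instance when the rows were already in the requested order, so it only suits clicks that are known to change the content.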