I am using Puppeteer to drive headless Chrome from Node.js.
I am navigating through a local website and scraping its content: I read the URLs of all
anchor (<a>) elements and save the content of each linked page to a file.
const puppeteer = require('puppeteer');
const { URL } = require('url');
const fse = require('fs-extra');
const path = require('path');
puppeteer.launch().then(async browser => {
  const page = await browser.newPage();
  // Navigate to local website
  await page.goto('http://localhost:5976/', { waitUntil: 'networkidle0' });
  // Gather all anchors on my webpage and save their URLs in an array
  const hrefs = await page.evaluate(() => {
    const anchors = document.querySelectorAll('a');
    return [].map.call(anchors, a => a.href);
  });
  browser.close();
  // Loop through all the URLs and call them
  for (var i = 0; i < hrefs.length; i++) {
    start(hrefs[i]);
  }
});
// Function to browse URL
async function start(urlToFetch) {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  page.on('response', async (response) => {
    // Treat content of page
  });
  await page.goto(urlToFetch, {
    waitUntil: 'networkidle2'
  });
  setTimeout(async () => {
    await browser.close();
  }, 60000 * 4);
}
On the other hand, on my local website, every page performs an AJAX call in a beforeunload handler:
$(window).on("beforeunload", function() {
  // AJAX call
});
I discovered that if I go through my website in a regular browser, this AJAX call is performed whenever I leave a page. But when I browse my website from a headless browser through the Node.js code above, the AJAX call is never made.
To verify, I moved the AJAX call into a DOMContentLoaded handler, and it was called from the headless browser. So the problem is with beforeunload.
It could be that my Node.js code is not closing each page, so the event never fires.
What event can I use instead, so that the AJAX call runs as the last thing on a page in both headless and normal browsers?
Since Puppeteer v1.4.0, you can pass the runBeforeUnload option to the page.close method:
await page.close({runBeforeUnload: true});
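As a rough sketch of how this could fit into the crawl from the question: the helpers below open each URL in its own tab of a single shared browser and close the tab with runBeforeUnload, so the page's beforeunload handlers get a chance to run. The names `visit`, `crawl`, and `settleMs` are hypothetical, and the short pause after closing is an assumption: as far as I can tell, page.close does not wait for the page to actually go away when runBeforeUnload is set, so closing the browser right afterwards could cut the AJAX request off. The dialog handler is there because a beforeunload handler might raise a confirmation dialog.

```javascript
// Sketch only: `visit`, `crawl`, and `settleMs` are illustrative names,
// not part of Puppeteer.

const delay = ms => new Promise(resolve => setTimeout(resolve, ms));

// Open one URL in a new tab, then close the tab with beforeunload handlers enabled.
async function visit(browser, url, settleMs) {
  const page = await browser.newPage();
  // A beforeunload handler may trigger a "Leave site?" dialog; accept it
  // so that closing can proceed. Errors from a late dialog are ignored.
  page.on('dialog', dialog => dialog.accept().catch(() => {}));
  await page.goto(url, { waitUntil: 'networkidle2' });
  // By default page.close() skips beforeunload handlers; this opts in.
  await page.close({ runBeforeUnload: true });
  // Assumption: give the handler's AJAX request a moment to be sent.
  await delay(settleMs);
}

// Visit the URLs one after another, reusing the same browser instance.
async function crawl(browser, urls, settleMs = 1000) {
  for (const url of urls) {
    await visit(browser, url, settleMs);
  }
}

// Usage (requires the puppeteer package):
// const puppeteer = require('puppeteer');
// (async () => {
//   const browser = await puppeteer.launch();
//   await crawl(browser, ['http://localhost:5976/']);
//   await browser.close();
// })();
```

Reusing one browser and closing pages individually also avoids launching a separate Chrome process per link, which the original start function does.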