I'm having trouble scraping data from the web with Puppeteer and querySelector.
I have a Node.js web server that handles a POST request and then calls a function to scrape the data. I'm sending two parameters (postBlogUrl and postDomValue).
postDomValue contains, as a string, the selector I'm trying to fetch data from, for example: [itemprop='articleBody'].
If I hard-code the selector ([itemprop='articleBody']), everything works and I can retrieve the data, but if I use the postDomValue variable, nothing is returned.
I already tried escaping the variable with CSS.escape(postDomValue), but no luck.
fetchBlogContent: async function(postBlogUrl, postDomValue) {
    try {
        const puppeteer = require('puppeteer');
        const browser = await puppeteer.launch();
        const page = await browser.newPage();
        await page.goto(postBlogUrl, {
            waitUntil: 'load'
        });
        let description = await page.evaluate(() => {
            // This works:
            return document.querySelector("[itemprop='articleBody']").innerHTML;
            // This doesn't:
            // return document.querySelector(postDomValue).innerHTML;
        });
        return description;
    } catch (err) {
        // handle err
        return err;
    }
}
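const description = await page.evaluate(
    // value is the selector string, received inside the page context
    (value) => document.querySelector(value).innerHTML,
    postDomValue // pass the string as-is; JSON.stringify would wrap it in extra quotes
);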
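The function given to page.evaluate() runs inside the browser page, not in Node, so it cannot see postDomValue through a closure; the value has to be passed as an extra argument. See the Puppeteer docs on how to pass arguments to page.evaluate().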
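To illustrate why the closure variable is not visible, here is a minimal sketch that imitates the behaviour with Node's built-in vm module. The helper evaluateInSandbox is hypothetical and is not how Puppeteer is actually implemented (Puppeteer sends the function to the browser over the DevTools protocol); it only assumes the same general idea, namely that the function is shipped as source text and the arguments are serialized.

```javascript
const vm = require('node:vm');

// Hypothetical stand-in for page.evaluate(): the function is turned into
// source text and run in a separate context, with JSON-serialized arguments.
function evaluateInSandbox(fn, ...args) {
  const context = vm.createContext({});
  const source = `(${fn.toString()})(...${JSON.stringify(args)})`;
  return vm.runInContext(source, context);
}

const postDomValue = "[itemprop='articleBody']";

// 1. Relying on the closure: the variable does not exist in the other context.
let closureError = null;
try {
  evaluateInSandbox(() => postDomValue.length);
} catch (err) {
  closureError = err.name;
}

// 2. Passing the value as an argument: it arrives unchanged.
const passed = evaluateInSandbox((value) => value, postDomValue);

// 3. Passing JSON.stringify(value): the string arrives wrapped in double
//    quotes, which would no longer be a valid CSS selector.
const doubleEncoded = evaluateInSandbox(
  (value) => value,
  JSON.stringify(postDomValue)
);

console.log(closureError);
console.log(passed);
console.log(doubleEncoded);
```

The third case is worth noting: arguments are already serialized for you, so stringifying the selector beforehand changes its value rather than protecting it.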