Tags: javascript, node.js, web-scraping, puppeteer, dom-traversal

Querying a selector that is inside a frame


I am building a simple scraper with Puppeteer/JS.

I am trying to get an array of the paragraphs on a page. The HTML is shown in [this image][1].

When I use the id (#iframeContent), I get nothing. When I try a deeper selector, such as:

await page.$eval('#bookDesc_iframe_wrapper > iframe')

it loses track as soon as I try to continue the selector with > document or > #document.

In the Chrome dev console, document.querySelector only finds the element after I have manually expanded the frame's document > html > body in the Elements panel. Otherwise even the console doesn't see #iframeContent.


Solution

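  • Selectors do not cross frame boundaries: a query run on the page only sees the top-level document, not the document inside an iframe. You first have to find the frame and then work inside it. Use page.frames() to get a list of all frames of the page and frame.name() to identify your target frame (it returns the frame's name attribute, or its id if the name is empty).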

    You can then execute functions like frame.$$ or frame.evaluate as you would on a page.

    The code could look like this:

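    const frames = page.frames(); // synchronous, so no await is needed
    const iframe = frames.find(f => f.name() === 'bookDesc_iframe'); // name or id for the frame
    
    // resolves to an array of element handles, one per <p> inside the frame
    const paragraphs = await iframe.$$('p');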
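
Since the question asks for an array of paragraphs rather than element handles, the approach above could be wrapped up roughly as follows. This is only a sketch: findFrameByName and getParagraphTexts are hypothetical helper names, not part of Puppeteer, and the frame name 'bookDesc_iframe' is taken from the answer above. It assumes page is an already-opened Puppeteer Page.

```javascript
// Hypothetical helper: find a frame by the value frame.name() reports
// (the iframe's name attribute, or its id when the name is empty).
function findFrameByName(page, name) {
  // page.frames() lists every frame attached to the page, main frame included.
  return page.frames().find((f) => f.name() === name);
}

// Hypothetical helper: return the trimmed text of every <p> inside the named frame.
async function getParagraphTexts(page, frameName) {
  const frame = findFrameByName(page, frameName);
  if (!frame) {
    // The frame may not be attached yet, or the name may be wrong.
    throw new Error(`No frame named "${frameName}" found`);
  }
  // $$eval runs the callback inside the frame's own document,
  // so plain strings come back instead of element handles.
  return frame.$$eval('p', (nodes) => nodes.map((n) => n.textContent.trim()));
}
```

It would be called as const texts = await getParagraphTexts(page, 'bookDesc_iframe'); once the page has loaded. If the iframe has neither a name nor an id, another route is to select the iframe element itself, for example with page.$('#bookDesc_iframe_wrapper > iframe'), and call contentFrame() on the resulting element handle.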