Tags: javascript, node.js, puppeteer, headless-browser

My loop in Puppeteer isn't looping when creating PDF files from webpages


I'm attempting to use Puppeteer to scrape about 300 webpages to PDF, but my loop isn't working. The intent is that Puppeteer loads each page from an array, generates a PDF, and then works through all of the URLs before closing.

Using the code below, Puppeteer successfully scrapes the first URL -- and then stops.

Code (URLs are placeholders):

const puppeteer = require('puppeteer');

(async () => {
  // Create a browser instance
  const browser = await puppeteer.launch({ headless: true });

  // Create a new page
  const page = await browser.newPage();

  // Set viewport width and height
  await page.setViewport({ width: 1280, height: 720 });

  const urlArray = [
    'https://ask.metafilter.com/369890/Patio-furniture-designed-for-the-PNW',
    'https://ask.metafilter.com/369889/Its-the-police-should-I-document-my-concern',
    'https://ask.metafilter.com/369888/Training-my-over-excited-dog'
  ];

  for (var i = 0; i < urlArray.length; i++) {
    const website_url = urlArray[i];

    // Open URL in current page
    await page.goto(website_url, { waitUntil: 'networkidle0' });

    // Download the PDF
    const pdf = await page.pdf({
      path: 'images/page_${i+1}.pdf',
      margin: { top: '100px', right: '50px', bottom: '100px', left: '50px' },
      printBackground: true,
    });
  }

  // Close the browser instance
  await browser.close();
})();

However, if I attempt to create a screenshot, swapping out this:

// Download the PDF
  const pdf = await page.pdf({
    path: 'images/page.pdf',
    margin: { top: '100px', right: '50px', bottom: '100px', left: '50px' },
    printBackground: true,
  });

For this:

// Capture screenshot
  await page.screenshot({
    path: `images/screenshot_full_${i+1}.jpg`,
    fullPage: true
  });

It loops fine, and goes through every URL in the array.

What am I missing?

I'm working from these tutorials: https://www.bannerbear.com/blog/how-to-make-a-pdf-from-html-with-node-js-and-puppeteer/, https://www.bannerbear.com/blog/how-to-take-screenshots-with-puppeteer/


Solution

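  • As @ggorlen pointed out, I was using single quotation marks where I should have had backticks. JavaScript only evaluates ${...} inside a backtick-delimited template literal, so the single-quoted path was the literal filename images/page_${i+1}.pdf on every iteration. The loop was most likely running all along, with each PDF overwriting the previous one, which made it look as though it stopped after one page. The screenshot version worked because its path already used backticks.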

    path: 'images/page_${i+1}.pdf'
    

    Should be:

    path: `images/page_${i+1}.pdf`
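
To see the difference without launching a browser, here is a minimal sketch that builds the output paths both ways and counts how many distinct filenames each produces. The entries in urlArray are hypothetical stand-ins, and the variable names are mine, not from the original script.

```javascript
// Hypothetical stand-ins for the real URL list; only the count matters here.
const urlArray = ['first-url', 'second-url', 'third-url'];

const singleQuotedPaths = [];
const templatePaths = [];

for (let i = 0; i < urlArray.length; i++) {
  // Single quotes: ${i+1} stays as literal text, so every iteration yields the same path.
  singleQuotedPaths.push('images/page_${i+1}.pdf');

  // Backticks: ${i + 1} is evaluated, so every iteration yields a distinct path.
  templatePaths.push(`images/page_${i + 1}.pdf`);
}

// Count the distinct filenames each approach would write to.
const uniqueSingleQuoted = new Set(singleQuotedPaths).size;
const uniqueTemplate = new Set(templatePaths).size;

console.log(uniqueSingleQuoted); // 1
console.log(uniqueTemplate);     // 3
console.log(templatePaths);
```

With one distinct filename, every call to page.pdf() would write to the same file, which matches what I saw. If you use ESLint, its no-template-curly-in-string rule should flag this kind of mistake.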