
How to manage a login session in headless Chrome?


I want to create a scraper that:

  1. opens a headless browser,
  2. goes to a url,
  3. logs in (the site uses Steam OAuth),
  4. fills some inputs,
  5. and clicks 2 buttons.

My problem is that every new headless browser instance clears my login session, so I have to log in again each time.

How can I persist the session across instances? (I am using Puppeteer with headless Chrome.)

Or how can I open a headless Chrome instance that is already logged in (given that I am already logged in in my main Chrome window)?


Solution

  • In Puppeteer you have access to the session cookies through page.cookies().

    So once you log in, you can get every cookie and save it in a JSON file:

    const fs = require('fs');
    const cookiesFilePath = 'cookies.json';
    // Save the session cookies (run this inside an async function, after logging in)
    const cookiesObject = await page.cookies();
    // Write the cookies to a file so that later runs can reuse them
    fs.writeFile(cookiesFilePath, JSON.stringify(cookiesObject), function (err) {
      if (err) {
        console.log('The file could not be written.', err);
        return;
      }
      console.log('Session has been successfully saved');
    });
    

    Then, on your next run, right before calling page.goto(), you can call page.setCookie() to load the cookies from the file one by one:

    const previousSession = fs.existsSync(cookiesFilePath);
    if (previousSession) {
      // If the file exists, load the cookies from it
      const cookiesString = fs.readFileSync(cookiesFilePath, 'utf8');
      const parsedCookies = JSON.parse(cookiesString);
      if (parsedCookies.length !== 0) {
        for (const cookie of parsedCookies) {
          await page.setCookie(cookie);
        }
        console.log('Session has been loaded in the browser');
      }
    }
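
The save and load steps can also be wrapped in two small helpers. This is only a sketch: saveCookies and loadCookies are hypothetical names, not part of Puppeteer, and it assumes `page` exposes the page.cookies() and page.setCookie() methods used in the snippets here (newer Puppeteer releases may prefer browser-level cookie methods). It uses the promise-based fs API so that errors surface through await, and it passes all cookies to page.setCookie() in one call instead of looping.

```javascript
const fs = require('fs/promises');

// Hypothetical helper: write the page's current cookies to a JSON file.
// Returns the number of cookies saved.
async function saveCookies(page, filePath) {
  const cookies = await page.cookies();
  await fs.writeFile(filePath, JSON.stringify(cookies, null, 2));
  return cookies.length;
}

// Hypothetical helper: restore cookies from a JSON file, if it exists.
// Returns the number of cookies loaded (0 when there is no saved session).
async function loadCookies(page, filePath) {
  let raw;
  try {
    raw = await fs.readFile(filePath, 'utf8');
  } catch (err) {
    if (err.code === 'ENOENT') return 0; // no saved session yet
    throw err;
  }
  const cookies = JSON.parse(raw);
  if (cookies.length > 0) {
    // page.setCookie() accepts several cookies at once
    await page.setCookie(...cookies);
  }
  return cookies.length;
}
```

A typical flow would be to call `await loadCookies(page, 'cookies.json')` before page.goto(), log in only if the site still asks for it, and then call `await saveCookies(page, 'cookies.json')` once the login has succeeded.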
    

    Check out the docs: