I want to create a scraper that:
My problem is that every new instance of the headless browser clears my login session, so I have to log in again and again...
How can I keep the session across instances? (I'm using Puppeteer with headless Chrome.)
Or how can I open a headless Chrome instance that is already logged in (given that I'm already logged in in my main Chrome window)?
In Puppeteer you have access to the session cookies through page.cookies().
So once you log in, you can get all the cookies and save them in a JSON file:
// Note: the `await` calls below assume this code runs inside an async function
const fs = require('fs');
const cookiesFilePath = 'cookies.json';
// Save session cookies
const cookiesObject = await page.cookies();
// Write cookies to a temp file to be used in other profile pages
fs.writeFile(cookiesFilePath, JSON.stringify(cookiesObject), function (err) {
  if (err) {
    console.log('The file could not be written.', err);
    return; // do not report success when the write failed
  }
  console.log('Session has been successfully saved');
});
Then, on your next iteration, right before using page.goto(), you can call page.setCookie() to load the cookies from the file one by one:
const previousSession = fs.existsSync(cookiesFilePath);
if (previousSession) {
  // If the file exists, load the cookies
  const cookiesString = fs.readFileSync(cookiesFilePath, 'utf8');
  const parsedCookies = JSON.parse(cookiesString);
  if (parsedCookies.length !== 0) {
    for (const cookie of parsedCookies) {
      await page.setCookie(cookie);
    }
    console.log('Session has been loaded in the browser');
  }
}
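The two steps above could also be wrapped into a pair of reusable helpers. This is only a minimal sketch: the names saveCookies and loadCookies are hypothetical (they are not part of Puppeteer), and the code assumes `page` is a Puppeteer page exposing page.cookies() and page.setCookie() as used above.

```javascript
const fs = require('fs').promises;

// Hypothetical helper: write the page's current cookies to a JSON file.
// Returns the number of cookies saved.
async function saveCookies(page, filePath) {
  const cookies = await page.cookies();
  await fs.writeFile(filePath, JSON.stringify(cookies, null, 2));
  return cookies.length;
}

// Hypothetical helper: restore cookies from the file if it exists.
// Returns the number of cookies loaded (0 when there is no saved session).
async function loadCookies(page, filePath) {
  let raw;
  try {
    raw = await fs.readFile(filePath, 'utf8');
  } catch (err) {
    if (err.code === 'ENOENT') return 0; // no previous session on disk
    throw err;
  }
  const cookies = JSON.parse(raw);
  if (cookies.length > 0) {
    // page.setCookie() accepts several cookies at once
    await page.setCookie(...cookies);
  }
  return cookies.length;
}
```

With helpers like these, a run would call loadCookies(page, cookiesFilePath) right before page.goto(), and saveCookies(page, cookiesFilePath) once the login has succeeded.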
Check out the Puppeteer docs for page.cookies() and page.setCookie().