In the Puppeteer documentation I found that I could use
await page.authenticate({ username: 'test', password: 'test' });
to access pages with basic authentication.
But it seems that by the time handlePageFunction runs, the request has already been made.
So how can I do that?
Apify.main(async () => {
    const requestQueue = await Apify.openRequestQueue(`PC_${settings.project}_${time}`);
    await requestQueue.addRequest({ url: settings.baseUrl });
    const crawler = new Apify.PuppeteerCrawler({
        requestQueue,
        launchPuppeteerOptions: {
            headless: settings.headless,
            // slowMo: 500,
        },
        maxRequestsPerCrawl: settings.maxurls,
        maxConcurrency: settings.maxcrawlers,
        handlePageFunction: async ({ request, response, page }) => {
            await page.authenticate({ username: 'test', password: 'test' });
            await page.waitFor(settings.waitForPageload);
            const requestUrl = request.url
            const loadUrl = request.loadedUrl
            let isRedirected = false
            if (requestUrl !== loadUrl) {
                isRedirected = { from: requestUrl, to: loadUrl }
            }
You can manipulate the page before it is opened with gotoFunction.
If you need to log in to a website, you can check this small login example.
For basic authentication, call page.authenticate inside gotoFunction, before page.goto:
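const crawler = new Apify.PuppeteerCrawler({
    // ...your other crawler options (requestQueue, handlePageFunction, etc.)
    gotoFunction: async ({ page, request }) => {
        // Set the credentials first, so they are sent with the navigation request.
        await page.authenticate({ username: 'test', password: 'test' });
        return page.goto(request.url, { timeout: 120000 });
    },
});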
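
If you want to reuse this across crawlers, the same idea can be wrapped in a small factory. This is only a sketch: makeAuthGotoFunction is a hypothetical helper name (it is not part of the Apify SDK), and the 120000 ms default timeout is simply copied from the snippet above. It relies on nothing beyond page.authenticate and page.goto.

```javascript
// Hypothetical helper (not part of the Apify SDK): returns a gotoFunction
// that sets basic-auth credentials before navigating to the request URL.
function makeAuthGotoFunction(credentials, gotoOptions = { timeout: 120000 }) {
    return async ({ page, request }) => {
        // authenticate() must run before goto(), otherwise the first
        // request is sent without credentials.
        await page.authenticate(credentials);
        return page.goto(request.url, gotoOptions);
    };
}

// Assumed usage inside Apify.main(), next to your existing options:
//
// const crawler = new Apify.PuppeteerCrawler({
//     requestQueue,
//     gotoFunction: makeAuthGotoFunction({ username: 'test', password: 'test' }),
//     handlePageFunction: async ({ request, page }) => { /* ... */ },
// });
```

With this in place, handlePageFunction no longer needs its own page.authenticate call, since the credentials are already set when the page loads.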