I'm trying to get puppeteer to send an Authorization header, without receiving a challenge, for 1st/2nd-party requests only (i.e. not to 3rd parties), and without unintended consequences. The main goals are to authenticate where needed and to avoid leaking the killer combination of Authorization + Referer.
Using page.authenticate() won't work, because it requires a challenge. Using page.setExtraHTTPHeaders() sets the header, but then sends it on to third parties. Using page.setRequestInterception() allows me to introduce some conditional logic, and does address the main goals, but it appears to add a bunch of complexity and unintended consequences (e.g. around caching).
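For illustration, the "conditional logic" could be as small as an origin check. This is only a sketch: isTrustedOrigin and the trusted list are hypothetical names, not part of puppeteer, and the localhost origin is just the one from the test page in this question.

```javascript
// Hypothetical helper (not a puppeteer API): decide whether a request URL
// belongs to an origin we are willing to send credentials to.
function isTrustedOrigin(requestUrl, trustedOrigins) {
  let origin;
  try {
    // origin is scheme + host + port, e.g. "http://localhost:8000"
    origin = new URL(requestUrl).origin;
  } catch (err) {
    return false; // unparseable URLs never get credentials
  }
  return trustedOrigins.includes(origin);
}

// Assumed list of 1st/2nd-party origins for this example.
const trusted = ['http://localhost:8000'];

console.log(isTrustedOrigin('http://localhost:8000/fonts/a.woff2', trusted)); // true
console.log(isTrustedOrigin('http://httpbin.org/headers', trusted));          // false
```

Comparing whole origins rather than hostnames means a different scheme or port on the same host is treated as untrusted, which seems like the safer default when the goal is to avoid leaking credentials.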
My specific use case is around webfonts, fwiw.
Here's how I've confirmed that the extra header is sent to a 3rd party with page.setExtraHTTPHeaders (in this case, httpbin):
Serve a simple page with an iframe to httpbin.org/headers:
var http = require('http')
http.createServer(function (request, response) {
  console.log(request.headers)
  response.writeHead(200)
  response.end('<iframe src="http://httpbin.org/headers" width="100%" height="100%"></iframe>\n')
}).listen(8000)
Use puppeteer to fetch that page:
const puppeteer = require('puppeteer');
const url = 'http://localhost:8000';
(async () => {
  const browser = await puppeteer.launch()
  const page = await browser.newPage()
  await page.setExtraHTTPHeaders({ Authorization: 'Basic dXNlcjpwYXNz' })
  //await page.authenticate({ username: 'user', password: 'pass' })
  await page.goto(url)
  await page.screenshot({path: '/tmp/headers.png'})
  await browser.close()
})()
Contents of the httpbin.org/headers response (captured on the wire with tcpflow -c):
{
  "headers": {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
    "Accept-Encoding": "gzip, deflate",
    "Accept-Language": "en-GB",
    "Authorization": "Basic dXNlcjpwYXNz", <----- Authorization is forwarded
    "Host": "httpbin.org",
    "Referer": "http://localhost:8000/",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/83.0.4103.0 Safari/537.36",
    "X-Amzn-Trace-Id": "Root=1-5ecdb903-0c61b77370a47d894aa8aa7c"
  }
}
You can use the request.isNavigationRequest() method to limit when auth headers etc. are applied: with request interception enabled, it lets you pass through any request that is not a navigation request without modifying it.
A similar issue was reported on the GitHub puppeteer project, which led to this method being added. The author gave this example usage:
// enable request interception
await page.setRequestInterception(true);
// add header for the navigation requests
page.on('request', request => {
  // Do nothing in case of non-navigation requests.
  if (!request.isNavigationRequest()) {
    request.continue();
    return;
  }
  // Add a new header for navigation request.
  const headers = request.headers();
  headers['X-Just-Must-Be-Request-In-Main-Request'] = 1;
  request.continue({ headers });
});
// navigate to the website
await page.goto('https://example.com');