
How to retrieve page HTML from Lighthouse?


I'm trying to add some custom metrics to Lighthouse. In order to do some basic checks, I need to get the raw HTML of the webpage.

I've tried driver.sendCommand (with DOM.getDocument and DOM.getFlattenedDocument), driver.querySelectorAll('html'), and driver.evaluateAsync('document.documentElement.outerHTML'), all without success. How can I get the raw HTML from Chrome into Lighthouse?

Thank you,

Fabio


Solution

  • This should be a straightforward call to driver.evaluateAsync from within a gatherer. Something like:

    const expression = `document.querySelector('html').outerHTML`;
    const html = await passContext.driver.evaluateAsync(expression);
    

    You should be able to get the HTML inside the afterPass method of a gatherer. The HTML w/o Javascript gatherer does this; modify the expression in that gatherer to get an idea of how it should work.

    Here's a really rough example, made by hacking on HTML w/o Javascript, that just logs the HTML once it's gathered:

    [Image: rough screenshot]
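To make the afterPass suggestion concrete, here is a minimal sketch of what such a gatherer might look like. It assumes the legacy gatherer interface discussed above, where afterPass receives a passContext carrying the driver. The class name PageHtml, the empty Gatherer stand-in, and fakeDriver are hypothetical and exist only so the sketch runs without Lighthouse installed; in a real custom gatherer you would extend the Gatherer base class exported by Lighthouse instead. Note that outerHTML returns the DOM as serialized at that moment, which may differ from the bytes the server originally sent.

```javascript
// Hypothetical stand-in so this sketch runs on its own.
// In a real custom gatherer, extend Lighthouse's exported Gatherer class instead.
class Gatherer {}

// Hypothetical gatherer that returns the page's serialized HTML as its artifact.
class PageHtml extends Gatherer {
  async afterPass(passContext) {
    // Same expression as in the answer: evaluated inside the page by the driver.
    const expression = `document.querySelector('html').outerHTML`;
    const html = await passContext.driver.evaluateAsync(expression);
    return html; // becomes the artifact value for this gatherer
  }
}

// Hypothetical fake driver, used only to exercise the gatherer outside Lighthouse.
const fakeDriver = {
  lastExpression: null,
  evaluateAsync: async (expression) => {
    fakeDriver.lastExpression = expression; // record what would run in the page
    return '<html><head></head><body>stub</body></html>';
  },
};

// Usage: call afterPass the way Lighthouse would, with a passContext-like object.
new PageHtml()
  .afterPass({driver: fakeDriver})
  .then((html) => console.log(html));
```

With a real Lighthouse run, the gatherer would be registered in a custom config and the returned string would be available to audits under the gatherer's artifact name.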