java htmlunit

Return all of an HtmlPage's HTML


I want the entire HTML for a given HtmlPage object.

What property should I use?


Solution

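  • In HtmlUnit, HtmlPage implements the Page interface, so you can call Page#getWebResponse() to get the web response that the HtmlPage was built from, and then WebResponse#getContentAsString() to get its body as a string. Note that this is the raw response as the server sent it, not the DOM after any JavaScript has run; if you need the current DOM, HtmlPage#asXml() serializes that instead. Here's a method that returns the raw response text: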

    // Fetches the page at the given URL and returns the response body as the server sent it.
    public String getRawPageText(WebClient client, String url)
            throws FailingHttpStatusCodeException, MalformedURLException, IOException {
        HtmlPage page = client.getPage(url);
        return page.getWebResponse().getContentAsString();
    }
    

    Or, using an HtmlPage object that you've already fetched:

    // Returns the response body that the already-fetched page was built from.
    public String getRawPageText(HtmlPage page) {
        return page.getWebResponse().getContentAsString();
    }