How to get an HTML page using HtmlUnit


I know you may think this question is stupid, but I need to use HtmlUnit. However, it returns a page either as XML or as text.

I don't know how to get the pure HTML (the same source code that a browser shows for the page).

I need this because I have to pass the HTML to some modules that are already written. Any ideas?


Solution

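  • You can read the raw source from the page's WebResponse instead of using the page's XML or text rendering:

    import com.gargoylesoftware.htmlunit.Page;
    import com.gargoylesoftware.htmlunit.WebClient;
    import com.gargoylesoftware.htmlunit.WebResponse;

    WebClient webClient = new WebClient();
    Page page = webClient.getPage("http://example.com");
    // The WebResponse holds the body exactly as the server sent it.
    WebResponse response = page.getWebResponse();
    String content = response.getContentAsString();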
    

    See the Javadoc for the WebResponse#getContentAsString() method.
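
    For reference, here is the same idea as a complete class. Treat it as a minimal sketch rather than the one right way to do it. It assumes an HtmlUnit 2.x release recent enough that WebClient implements AutoCloseable (in 3.x the packages are org.htmlunit instead of com.gargoylesoftware.htmlunit). The class name RawHtmlFetcher and the helper fetchRawHtml are made up for illustration. Disabling JavaScript and CSS is optional and is done here only because the raw response body does not depend on them.

```java
import com.gargoylesoftware.htmlunit.Page;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.WebResponse;

import java.io.IOException;

public class RawHtmlFetcher {

    // Hypothetical helper: returns the response body as the server sent it,
    // not HtmlUnit's XML or text rendering of the parsed page.
    static String fetchRawHtml(String url) throws IOException {
        // try-with-resources closes the client and its windows when done
        // (assumes a version where WebClient is AutoCloseable).
        try (WebClient webClient = new WebClient()) {
            // Optional: the raw source does not need scripts or styles processed.
            webClient.getOptions().setJavaScriptEnabled(false);
            webClient.getOptions().setCssEnabled(false);

            Page page = webClient.getPage(url);
            WebResponse response = page.getWebResponse();
            return response.getContentAsString();
        }
    }

    public static void main(String[] args) throws IOException {
        // Same URL as in the snippet above; pass the result to your own modules.
        String html = fetchRawHtml("http://example.com");
        System.out.println(html);
    }
}
```

    The string returned by fetchRawHtml can be handed directly to the existing modules that expect plain HTML.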