Tags: javascript, java, htmlunit

Can't get correct source from HtmlUnit


I'm trying to get this page's source code with HtmlUnit, but it seems that some JavaScript is not being executed (or maybe something else is happening). It only returns the page as it is at the "Loading..." stage, which is displayed before the tables appear. Am I doing something wrong?

My Code:

[...]
WebClient webClient = new WebClient(BrowserVersion.CHROME);
Page page = webClient.getPage(url);
WebResponse response = page.getWebResponse();
String content = response.getContentAsString();
System.out.println("HTML: " + content);
[...]

Solution

  • page.getWebResponse() returns the response as received from the server without JavaScript modifications.

    Instead, load the page as an HtmlPage (the Page interface itself does not declare these methods) and use:

    page.asXml()
    

    or

    page.asText()
    

    For that page, HtmlUnit seems to throw an error:

    Invalid JavaScript value of type com.gargoylesoftware.htmlunit.ScriptException
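
Even when the rendered DOM is read instead of the raw response, the scripts may not have finished by the time the markup is requested. Below is a minimal sketch of one way to handle that: re-read the markup until the "Loading..." placeholder from the question is gone or a timeout expires. `RenderedContentPoller` and `waitFor` are hypothetical names, not part of HtmlUnit, and the fake page in `main` only stands in for a real one so the example runs with the JDK alone.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical helper (not part of HtmlUnit): polls a markup source until it looks rendered. */
final class RenderedContentPoller {

    private RenderedContentPoller() { }

    /**
     * Re-reads the markup every intervalMillis until isReady accepts it or
     * timeoutMillis has elapsed. Returns the last snapshot that was read,
     * whether or not it passed the check.
     */
    static String waitFor(Supplier<String> markup, Predicate<String> isReady,
                          long timeoutMillis, long intervalMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        String snapshot = markup.get();
        while (!isReady.test(snapshot) && deadline - System.nanoTime() > 0) {
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give back what we have so far.
                Thread.currentThread().interrupt();
                return snapshot;
            }
            snapshot = markup.get();
        }
        return snapshot;
    }

    public static void main(String[] args) {
        // Stand-in for a real page: shows the placeholder for the first two reads.
        AtomicInteger reads = new AtomicInteger();
        Supplier<String> fakePage = () -> reads.incrementAndGet() < 3
                ? "<div>Loading...</div>"
                : "<table><tr><td>data</td></tr></table>";

        String html = waitFor(fakePage, s -> !s.contains("Loading..."), 2_000, 50);
        System.out.println("HTML: " + html);
    }
}
```

With HtmlUnit, and assuming `page` is an `HtmlPage`, the call might look like `RenderedContentPoller.waitFor(page::asXml, xml -> !xml.contains("Loading..."), 10_000, 500)`. HtmlUnit also offers `webClient.waitForBackgroundJavaScript(timeoutMillis)`, which may be enough on its own. Neither approach helps if the page's script aborts with an exception like the one above, since the tables are then never built.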