How can I use HtmlUnit to extract a page that contains JavaScript as HTML? I found the sample code below, but it does not work.
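import java.util.logging.Level;
import org.apache.commons.logging.LogFactory;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class Downloader {
    public static void main(String[] args) throws Exception {
        // Silence the logging output of HtmlUnit and the HTTP client.
        LogFactory.getFactory().setAttribute("org.apache.commons.logging.Log", "org.apache.commons.logging.impl.NoOpLog");
        java.util.logging.Logger.getLogger("com.gargoylesoftware.htmlunit").setLevel(Level.OFF);
        java.util.logging.Logger.getLogger("org.apache.commons.httpclient").setLevel(Level.OFF);
        // Load the page with the default browser version and print its text.
        try (final WebClient webClient = new WebClient()) {
            final HtmlPage page = webClient.getPage("https://www.oddsportal.com/matches/soccer/");
            System.out.println(page.asText());
        }
        System.out.println("END");
    }
}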
With this code I end up in an infinite loop, and I don't know why. If I open the site above in the Firefox inspector, I can see the full HTML after the JavaScript has executed. How can I get the same result with HtmlUnit? Is it possible? Or should I use another library? Any suggestions?
HtmlUnit tends to have a lot of problems interpreting JavaScript. If you are just looking for the game data, you might have more success with a different approach: https://github.com/gingeleski/odds-portal-scraper
Anyway, I managed to get the code working by changing the BrowserVersion:
final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_60)
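If the call still hangs on some pages, one option is to bound it with a timeout so the program at least terminates. This does not fix the underlying JavaScript problem. It is only a rough sketch using plain JDK classes; the names TimeoutGuard and callWithTimeout are made up for illustration, and the lambda in main is a stand-in for the real HtmlUnit call.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of HtmlUnit.
final class TimeoutGuard {

    /**
     * Runs the task on a worker thread and waits at most timeoutMillis.
     * Returns Optional.empty() if the task did not finish in time
     * (or if it returned null).
     */
    static <T> Optional<T> callWithTimeout(Callable<T> task, long timeoutMillis) {
        // Daemon thread: a task that ignores interruption cannot keep the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return Optional.ofNullable(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            future.cancel(true); // ask the worker to stop via interruption
            return Optional.empty();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt flag
            future.cancel(true);
            return Optional.empty();
        } catch (ExecutionException e) {
            // The task itself threw; surface the original cause unchecked.
            throw new IllegalStateException("Task failed", e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Stand-in task. In the real program this would wrap the page fetch,
        // e.g. something like () -> webClient.getPage(url).asXml()
        Optional<String> html = callWithTimeout(() -> "<html></html>", 1000);
        System.out.println(html.orElse("timed out"));
    }
}
```

Note that cancelling only interrupts the worker thread. Code that ignores interruption keeps running in the background, which is why the worker is a daemon thread here.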