I am trying to use HtmlUnit to scrape scores off the BBC Sports website http://www.bbc.co.uk/sport/football/live-scores
The page initially loads showing the Premier League. There is a dropdown to select other leagues, and clicking the 'Update' button then refreshes the page (presumably via AJAX).
This code works fine to get the updated scores:
    long startTime = System.currentTimeMillis();
    String titleBar = getTitleBar(page);
    HtmlOption option = ukGroupDropdown.getOptionByValue(competition);
    ukGroupDropdown.setSelectedAttribute(option, true);
    HtmlButton updateButton = (HtmlButton) page.getElementById("filter-nav-submit");
    Thread.sleep(1000); // WHY???????
    HtmlPage newPage = updateButton.click();
    // Loop until the title bar text differs from the one captured before the click
    while (titleBar.equals(getTitleBar(newPage))) {
    }
    System.out.println("Took " + (System.currentTimeMillis() - startTime));
    return getMatches(newPage);
But if I remove the Thread.sleep before clicking the Update button, newPage is never updated. Why could this be? And is there a more robust way to wait, similar to the titleBar loop, which just compares the text from the title bar (e.g. "Barclays Premier League")?
Maybe the line
    ukGroupDropdown.setSelectedAttribute(option, true);
triggers an asynchronous (AJAX) call, and the updateButton.click() line needs to wait for that call to finish.
For example, the button could start out disabled and only become enabled once an item has been selected.
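If that guess is right, a fixed Thread.sleep could be replaced by polling for the condition you actually care about, with a timeout so it cannot hang forever. The following is only a rough sketch in plain Java: the PollingWait class and the waitUntil method are names made up for this example and are not part of HtmlUnit, and the condition in main is a stand-in for a real page check.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not an HtmlUnit class.
public class PollingWait {

    /**
     * Re-checks the condition every pollMillis until it is true
     * or timeoutMillis has elapsed.
     * Returns true if the condition became true before the deadline.
     */
    public static boolean waitUntil(BooleanSupplier condition,
                                    long timeoutMillis,
                                    long pollMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!condition.getAsBoolean()) {
            if (System.currentTimeMillis() >= deadline) {
                return false; // gave up: condition never became true
            }
            Thread.sleep(pollMillis);
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // Stand-in condition: becomes true roughly 200 ms after start.
        boolean met = waitUntil(
                () -> System.currentTimeMillis() - start >= 200, 2000, 50);
        System.out.println("condition met: " + met);
    }
}
```

In your code, the condition before the click might be something like () -> !updateButton.isDisabled(), and after the click () -> !titleBar.equals(getTitleBar(newPage)), assuming the button really is what changes. It may also be worth trying page.getWebClient().waitForBackgroundJavaScript(5000) in place of the sleep, which waits for pending background JavaScript jobs up to the given number of milliseconds.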