ElementNotFoundException in HtmlUnit, although the element exists


I've written a Java server that scrapes a website, but after a few requests (about 10 or so) I always get an ElementNotFoundException, although the element should be there. The program checks the website for info every few minutes, and after a few runs it throws that exception. This is my scraping code; I don't know why the element stops being found after a few requests.

final WebClient webClient = new WebClient();
try (final WebClient webClient1 = new WebClient()) {
    final HtmlPage page = webClient.getPage("http://b7rabin.iscool.co.il/מערכתשעות/tabid/217/language/he-IL/Default.aspx");

    WebResponse webResponse = page.getWebResponse();
    String content = webResponse.getContentAsString();
     //   System.out.println(content);


    HtmlSelect select = (HtmlSelect) page.getElementById("dnn_ctr914_TimeTableView_ClassesList");
    HtmlOption option = select.getOptionByValue("" + userClass);

    select.setSelectedAttribute(option, true);

    //String jscmnd = "javascript:__doPostBack('dnn$ctr914$TimeTableView$btnChangesTable','')";
    String jscmnd = "__doPostBack('dnn$ctr914$TimeTableView$btnChanges','')";

    ScriptResult result = page.executeJavaScript(jscmnd);

    HtmlPage page1 = (HtmlPage) result.getNewPage();

    String content1 = page1.getWebResponse().getContentAsString();
    //System.out.println(content1);
    System.out.println("-----");
    HtmlDivision getChanges = null;
    String changes = "";

    getChanges = page1.getHtmlElementById("dnn_ctr914_TimeTableView_PlaceHolder");   
    changes = getChanges.asText();
    changes = changes.replaceAll("\n", "").replaceAll("\r", "");

    System.out.println(changes);
}

The exception:

Exception in thread "Thread-0" com.gargoylesoftware.htmlunit.ElementNotFoundException: elementName=[*] attributeName=[id] attributeValue=[dnn_ctr914_TimeTableView_PlaceHolder]
at com.gargoylesoftware.htmlunit.html.HtmlPage.getHtmlElementById(HtmlPage.java:1552)
at scrapper$1.run(scrapper.java:108)

I am really desperate to solve this. It's the only bottleneck in my project.


Solution

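  • You just need to wait a little before working with the second page, as hinted here. The postback triggered by __doPostBack may not have finished when getNewPage() returns.

    Sleeping for 3 seconds before reading the page makes it succeed consistently: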

    HtmlPage page1 = (HtmlPage) result.getNewPage();
    
    Thread.sleep(3_000); // sleep for 3 seconds
    
    String content1 = page1.getWebResponse().getContentAsString();
    

    Also, you don't need two WebClient instances. Keep the one declared in the try-with-resources statement, which is closed automatically, and call getPage() on it.
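
A fixed 3-second sleep either waits longer than necessary or, on a slow response, not long enough. One possible alternative is to poll for the element until it appears or a timeout expires. The sketch below is only an illustration of that idea: PollingWait and waitFor are hypothetical names, not part of HtmlUnit, and the demo in main uses a stand-in lookup instead of a real page.

```java
import java.util.function.Supplier;

/** Hypothetical helper (not part of HtmlUnit): polls until a lookup yields a non-null value. */
final class PollingWait {

    /**
     * Calls lookup repeatedly, pausing intervalMillis between attempts, until it
     * returns a non-null value or timeoutMillis has elapsed.
     * Returns null on timeout or if the thread is interrupted.
     */
    static <T> T waitFor(Supplier<T> lookup, long timeoutMillis, long intervalMillis) {
        final long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            T value = lookup.get();
            if (value != null) {
                return value;
            }
            if (System.nanoTime() - deadline >= 0) {
                return null; // timed out without finding anything
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return null;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for a page whose element only shows up on the third lookup.
        final int[] calls = {0};
        String found = waitFor(() -> ++calls[0] >= 3 ? "placeholder" : null, 2_000, 50);
        System.out.println(found + " after " + calls[0] + " lookups");
    }
}
```

With HtmlUnit, the lookup could be something like () -> page1.getElementById("dnn_ctr914_TimeTableView_PlaceHolder"), since getElementById returns null when the element is absent, whereas getHtmlElementById throws ElementNotFoundException. If the postback replaces the page, the lookup may also need to re-read the current page from the window first. HtmlUnit also offers webClient.waitForBackgroundJavaScript(timeoutMillis), which may be worth trying before writing a custom wait.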