How to handle Too much redirect with HtmlUnit


I am trying to parse a site, but HtmlUnit throws a "Too much redirect" exception. Here is my code:

WebClient client = new WebClient(BrowserVersion.FIREFOX_24);
HtmlPage homePage = null;
String url = "http://www.freelake.org/pages/Freetown-Lakeville_RSD/Departments/Director_of_Financial_Operatio";
try {
    client.getOptions().setUseInsecureSSL(true);
    client.setAjaxController(new NicelyResynchronizingAjaxController());
    client.getOptions().setThrowExceptionOnFailingStatusCode(false);
    client.getOptions().setThrowExceptionOnScriptError(false);
    client.waitForBackgroundJavaScript(30000);
    client.waitForBackgroundJavaScriptStartingBefore(30000);
    client.getOptions().setCssEnabled(false);
    client.getOptions().setJavaScriptEnabled(true);
    client.getOptions().setRedirectEnabled(true);
    homePage = client.getPage(url);
    synchronized (homePage) {
        homePage.wait(25000);
    }
    System.out.println(homePage.asXml());
} catch (Exception e) {
    e.printStackTrace();
}        

The exception is shown below:

com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException: Too much redirect for http://www.freelake.org/resolver/2345183424.20480.0000/route.00/pages/Freetown-Lakeville_RSD/Departments/Director_of_Financial_Operatio
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseFromWebConnection(WebClient.java:1353)
at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseFromWebConnection(WebClient.java:1371)

Is there any way to solve this issue?


Solution

  • This happens because HtmlUnit caches the response, and the site redirects to another page and then back to the original one.

    I tested with the following, and it works:

    client.getCache().setMaxSize(0);
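
The explanation above can be illustrated with a small toy model. This is only a sketch under one reading of the answer, not HtmlUnit's actual implementation: the `Site` class, the `/page` and `/resolver` paths, and the cap of 20 redirects are all hypothetical names and values made up for the demonstration. The site sends the first request for a page to a resolver, which redirects back; a client that remembers the first redirect keeps replaying it, while a client without a cache sees the real page on the second visit.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a redirect-following client. NOT HtmlUnit's real code. */
public class RedirectCacheDemo {
    // Assumed redirect cap, chosen only for this demo.
    static final int MAX_REDIRECTS = 20;

    /** Hypothetical site: the page bounces through a resolver once, then serves content. */
    public static class Site {
        private boolean resolved = false;

        public String get(String url) {
            if (url.equals("/page")) {
                return resolved ? "200 content" : "302 /resolver";
            }
            resolved = true;    // the resolver records the visit
            return "302 /page"; // and sends the client back to the page
        }
    }

    /** Follows redirects, optionally remembering every response by URL. */
    public static String fetch(Site site, String url, boolean useCache) {
        Map<String, String> cache = new HashMap<>();
        for (int hops = 0; hops <= MAX_REDIRECTS; hops++) {
            String response = useCache ? cache.get(url) : null;
            if (response == null) {
                response = site.get(url);
                if (useCache) {
                    cache.put(url, response);
                }
            }
            if (!response.startsWith("302 ")) {
                return response; // a final, non-redirect answer
            }
            url = response.substring(4); // follow the redirect target
        }
        return "error: too many redirects";
    }

    public static void main(String[] args) {
        System.out.println(fetch(new Site(), "/page", true));  // cached redirect is replayed
        System.out.println(fetch(new Site(), "/page", false)); // second visit reaches the page
    }
}
```

With the cache enabled the model prints `error: too many redirects`; with it disabled it prints `200 content`. In the question's code, `client.getCache().setMaxSize(0);` needs to run before `client.getPage(url)` so that it is in effect when the page is requested.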