I am using HtmlUnit to log in to websites. Despite setting a connection timeout and a JavaScript timeout, the script just hangs while attempting to log in to one site. This site is an internal page that is not open to the web.
The following is the webClient configuration that I am using:
RefreshHandler rh = new RefreshHandler()
{
public void handleRefresh( final Page page, final URL url, final int seconds )
{
}
};
webClient.setRefreshHandler(rh);
webClient.getOptions().setTimeout(90000); //Set Connection Timeout to 1.5 minute
webClient.setJavaScriptTimeout(45000); //Set JavaScript Timeout to 0.75 minute
webClient.getOptions().setCssEnabled(false);
webClient.getOptions().setJavaScriptEnabled(true);
webClient.setAjaxController(new NicelyResynchronizingAjaxController());
webClient.getCookieManager().setCookiesEnabled(true);
webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
webClient.getOptions().setThrowExceptionOnScriptError(false);
webClient.getOptions().setPrintContentOnFailingStatusCode(false);
webClient.getOptions().setRedirectEnabled(true);
System.setProperty("https.protocols", "SSLv3,SSLv2Hello");
NOTE: I am using IBM JDK 1.7 and HtmlUnit 2.12 (the latest version). I have included all 21 dependency jars in the build path of my project. The project does not use any logging framework; it prints everything to the console using println statements.
I am trying to figure out the following:
Why does the script hang instead of timing out? I have researched this issue in this forum. I know people have run into it, but I have not come across any concrete solution for it. There is nothing on SourceForge that indicates an "open" bug in HtmlUnit either.
Is there a way to ensure the script never hangs? I thought setting the two timeouts above would have done the trick. Apart from network/connection issues and unresponsive JavaScript, what else might make a script wait forever?
I am aware that HtmlUnit uses Apache HttpClient to make the HTTP calls. I want to debug this issue without building from source (I want to keep that as my last option, as I am fairly new to Java). Is there a way to run HtmlUnit/HttpClient in a debug/verbose mode so that it prints everything to the console? Does the HtmlUnit API support this?
I tried all three of the following, but none of them seemed to work:
System.getProperties().put("org.apache.commons.logging.simplelog.defaultlog", "debug");
java.util.logging.Logger.getLogger("com.gargoylesoftware").setLevel(Level.ALL);
java.util.logging.Logger.getLogger("org.apache.http").setLevel(Level.ALL);
The first one is specified in the "logging" section on the HtmlUnit homepage.
I appreciate your assistance/comments. Thanks
As I don't quite know what the following line does, I will answer as if it weren't there:
System.setProperty("https.protocols", "SSLv3,SSLv2Hello");
You should first try to simplify your code as much as you can to get the minimal case (e.g. you haven't clarified whether your application still hangs with JavaScript disabled).
Once you've done that, you should take a close look at the HtmlPage you're fetching. Check what other objects the page is fetching, particularly iframes. Then take a look at this question and its answer:
Extremely simple code not working in HtmlUnit
(Yes, that was me experiencing the same symptoms as you.) However, I went a bit further and used jstack to get a lower-level idea of the threads and what they were doing. In short (and as a spoiler), there was some kind of issue regarding an iframe load loop. The solution... well... you're not going to like it. Check the question and you'll find out :)
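If running jstack against the process is awkward, a rough in-process equivalent can be put together with the plain JDK. This is only a sketch and not anything HtmlUnit provides: the class name ThreadDump, the helper dumpAfter and the "hang-watchdog" thread name are made up for illustration, and the sleep in main merely stands in for the page fetch that hangs. It prints the stack of every live thread after a delay, which should show where the HtmlUnit threads are stuck.

```java
import java.util.Map;

public class ThreadDump {

    /** Builds a jstack-like text dump of every live thread in this JVM. */
    public static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread t = entry.getKey();
            sb.append('"').append(t.getName()).append('"')
              .append(" state=").append(t.getState())
              .append(" daemon=").append(t.isDaemon()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    /** Starts a daemon thread that prints one dump to stderr after delayMillis. */
    public static Thread dumpAfter(final long delayMillis) {
        Thread watchdog = new Thread(new Runnable() {
            public void run() {
                try {
                    Thread.sleep(delayMillis);
                    System.err.println(dumpAllThreads());
                } catch (InterruptedException e) {
                    // Interrupted: the guarded call finished in time, nothing to print.
                }
            }
        }, "hang-watchdog");
        watchdog.setDaemon(true);
        watchdog.start();
        return watchdog;
    }

    public static void main(String[] args) throws Exception {
        Thread watchdog = dumpAfter(1000);
        // Assumption: replace this sleep with the call that hangs (the page fetch).
        Thread.sleep(1500);
        watchdog.interrupt(); // no effect here, the dump was already printed
    }
}
```

In the dump, look for threads that stay in the same frames across several runs; a repeated page-loading frame would be consistent with the iframe loop mentioned above.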
As a side note, try to enable all logging, set the throwException* flags to true, and remove any specific logging commands (if you are setting .setLevel(Level.ALL) and you are not getting anything, something must be wrong... but as HtmlUnit provides quite a lot of logging by default, you might not need to add more).
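One possible reason for seeing nothing after calling setLevel on a logger: in java.util.logging the default ConsoleHandler only passes INFO and above, so FINE and lower messages get dropped at the handler even when the logger itself allows them. The sketch below assumes HtmlUnit and HttpClient end up logging through java.util.logging (that depends on which backend commons-logging picks on your classpath, so treat it as an assumption). The class name VerboseLogging and the method enableAll are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.ConsoleHandler;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.Logger;

public class VerboseLogging {

    // Strong references: the LogManager only holds loggers weakly, so a
    // configured logger could otherwise be garbage collected and reset.
    private static final List<Logger> KEEP = new ArrayList<Logger>();

    /** Lets the named logger and its own console handler pass every level. */
    public static Logger enableAll(String loggerName) {
        Logger logger = Logger.getLogger(loggerName);
        logger.setLevel(Level.ALL);

        Handler handler = new ConsoleHandler();
        handler.setLevel(Level.ALL);        // a ConsoleHandler defaults to INFO
        logger.addHandler(handler);
        logger.setUseParentHandlers(false); // avoid duplicate lines from the root handler

        KEEP.add(logger);
        return logger;
    }

    public static void main(String[] args) {
        // Logger names taken from the question; call this before creating the WebClient.
        Logger htmlunit = enableAll("com.gargoylesoftware");
        enableAll("org.apache.http");
        htmlunit.fine("FINE messages now reach the console");
    }
}
```

If that still prints nothing, commons-logging is probably bound to a different backend, and the configuration has to be done there instead.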
Just my 2 cents.