java performance url timeout

How to quickly test if a URL exists and has content in Java?


I need to check whether hundreds of URLs exist, and my current approach takes too long. This is what I have so far:

public static boolean checkURL(URL u)
{
  HttpURLConnection connection = null;
  try
  {
    connection = (HttpURLConnection) u.openConnection();
    connection.setRequestMethod("HEAD");
    int code = connection.getResponseCode();
    System.out.println(code);
    // Decide based on the HTTP status code received; 200 means success.
    return code == 200;
  }
  catch (MalformedURLException e)
  {
    System.out.println("error");
  }
  catch (IOException e)
  {
    System.out.println("error2");
  }
  finally
  {
    if (connection != null)
    {
      connection.disconnect();
    }
  }

  return false;
}

Although this correctly detects whether a URL exists and has content, it is slow: the program often takes upwards of five minutes to run. Does anyone know a more efficient way to test this?

Note: It is important to test not only that the URL returns 200, but also that the website doesn't time out.


Solution

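  • Your code looks fine, and a HEAD request is about the simplest way to check a URL. What it lacks is a timeout: by default HttpURLConnection uses a timeout of 0, which means it waits indefinitely, so a single unresponsive host can stall the whole run. Set both a connect timeout and a read timeout on the HttpURLConnection.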

    Sample code for reference.

    import java.net.HttpURLConnection;
    import java.net.URL;

    public class UrlChecker {
        public static void main(String[] args) {
            // The test URL is meant to delay its response by about 3 seconds.
            System.out.println(URLExists(
                "http://slowwly.robertomurray.co.uk/delay/3000/url/http://www.google.co.uk"));
        }

        public static boolean URLExists(String targetUrl) {
            HttpURLConnection urlConnection = null;
            try {
                urlConnection = (HttpURLConnection) new URL(targetUrl).openConnection();
                urlConnection.setRequestMethod("HEAD");
                // Timeouts are in milliseconds. With 2000 ms the delayed test URL
                // should time out and throw an exception.
                urlConnection.setConnectTimeout(2000);
                urlConnection.setReadTimeout(2000);
                /* With 4000 ms the request should succeed, as the URL
                   should respond within 3 seconds.
                urlConnection.setConnectTimeout(4000);
                urlConnection.setReadTimeout(4000);
                */
                int code = urlConnection.getResponseCode();
                System.out.println("Response Code => " + code);
                System.out.println("Response Msg => " + urlConnection.getResponseMessage());
                return code == HttpURLConnection.HTTP_OK;
            } catch (Exception e) {
                // Timeouts, unknown hosts and malformed URLs all end up here.
                System.out.println("Exception => " + e.getMessage());
                return false;
            } finally {
                // Release the connection whether or not the check succeeded.
                if (urlConnection != null) {
                    urlConnection.disconnect();
                }
            }
        }
    }
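
Timeouts cap the cost of each slow URL, but with hundreds of URLs checked one after another the total can still add up. One possible way to speed this up, not part of the answer above, is to run the checks on a thread pool. The following is only a minimal sketch under some assumptions: `ParallelUrlChecker`, `checkAll` and the pool size of 20 are hypothetical names and values chosen for illustration, `urlExists` is a trimmed copy of the timeout-based check shown above, and `List.of` assumes Java 9 or later.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

public class ParallelUrlChecker {

    // HEAD request with 2-second connect and read timeouts (values are an assumption).
    public static boolean urlExists(String targetUrl) {
        HttpURLConnection connection = null;
        try {
            connection = (HttpURLConnection) new URL(targetUrl).openConnection();
            connection.setRequestMethod("HEAD");
            connection.setConnectTimeout(2000);
            connection.setReadTimeout(2000);
            return connection.getResponseCode() == HttpURLConnection.HTTP_OK;
        } catch (IOException | ClassCastException e) {
            // Timeouts, unreachable hosts, malformed or non-HTTP URLs count as "not available".
            return false;
        } finally {
            if (connection != null) {
                connection.disconnect();
            }
        }
    }

    // Runs the given check for every URL on a fixed-size pool and collects the
    // results in input order. Duplicate URLs collapse into a single map entry.
    public static Map<String, Boolean> checkAll(List<String> urls,
                                                Predicate<String> checker,
                                                int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Boolean>> futures = new ArrayList<>();
            for (String url : urls) {
                Callable<Boolean> task = () -> checker.test(url);
                futures.add(pool.submit(task));
            }
            Map<String, Boolean> results = new LinkedHashMap<>();
            for (int i = 0; i < urls.size(); i++) {
                try {
                    results.put(urls.get(i), futures.get(i).get());
                } catch (ExecutionException e) {
                    // The check itself threw; treat the URL as failed.
                    results.put(urls.get(i), false);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    results.put(urls.get(i), false);
                }
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Placeholder URLs; replace with the real list.
        List<String> urls = List.of("https://example.com", "https://example.org");
        checkAll(urls, ParallelUrlChecker::urlExists, 20)
            .forEach((url, ok) -> System.out.println(url + " => " + ok));
    }
}
```

The check is passed in as a `Predicate<String>` so the pooling logic can be tried out without network access. Because the work is almost entirely waiting on the network, a pool larger than the number of CPU cores is usually reasonable, but the right size depends on how many simultaneous requests the target servers tolerate.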