
Checking existence of URL in Rails using Ruby's Net::HTTP class, not working for certain URLs that exist


I want to implement a function that checks whether a URL entered in a search box exists. My app takes the URL, uses the Net::HTTP class to send a request to it, and checks whether the HTTP status code is 200.

The code inside that function looks roughly like this:

require "net/http"

url = URI.parse(input_url)
req = Net::HTTP.new(url.host, url.port)
res = req.request_head(url.path)  # HEAD request: headers only, no body

I then check whether res.code is 200; if it isn't, my app assumes the URL doesn't exist and handles the error.

I'm working with the URLs of classified ads, and this works fine for ads on Craigslist. For example, http://newyork.craigslist.org/brk/abo/5449483116.html gives me a 200 status code.

However, for any existing ad on Kijiji (or eBay Classifieds; both are run by eBay), such as http://www.kijiji.ca/v-hand-tool/city-of-toronto/auger-style-flex-installer-bit-for-wood/1129757133, I keep getting a 500 status code, which indicates an internal server error.

Does anyone have any suggestions of what is going wrong?


Solution

  • It could simply be that the server you're requesting doesn't handle HEAD requests properly. A regular GET should work in most cases. For example:

    require "net/http"

    # input_url = "http://www.kijiji.ca/v-hand-tool/city-of-toronto/auger-style-flex-installer-bit-for-wood/1129757133"
    url = URI.parse(input_url)
    req = Net::HTTP.new(url.host, url.port)
    res = req.request_get(url.path)  # full GET instead of HEAD
    

    For the Kijiji URL above, this returns:

    #<Net::HTTPOK 200 OK readbody=true>
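
Building on that, here is a minimal sketch of a check that tries HEAD first and falls back to GET when HEAD does not succeed. The helper names `success_response?` and `url_exists?` and the 5-second timeouts are assumptions made for illustration; they are not from the original code. The sketch also uses `request_uri` instead of `path` so query strings are kept and an empty path becomes "/", turns on TLS for https URLs, and tests the response class rather than comparing `res.code`, which Net::HTTP returns as a String ("200"), not an Integer.

```ruby
require "net/http"
require "openssl"
require "uri"

# Hypothetical helper: treat any 2xx response as "the URL exists".
def success_response?(res)
  res.is_a?(Net::HTTPSuccess)
end

# Hypothetical helper: true if the URL answers HEAD (or, failing that, GET)
# with a 2xx status. Returns false for malformed or non-HTTP URLs and for
# network-level failures.
def url_exists?(input_url)
  uri = URI.parse(input_url)
  # URI::HTTPS is a subclass of URI::HTTP, so this accepts both schemes.
  return false unless uri.is_a?(URI::HTTP) && uri.host

  Net::HTTP.start(uri.host, uri.port,
                  use_ssl: uri.scheme == "https",
                  open_timeout: 5, read_timeout: 5) do |http|
    res = http.request_head(uri.request_uri)
    # Some servers mishandle HEAD, so retry with a regular GET.
    res = http.request_get(uri.request_uri) unless success_response?(res)
    success_response?(res)
  end
rescue URI::InvalidURIError, SocketError, SystemCallError,
       Timeout::Error, OpenSSL::SSL::SSLError
  false
end

# Example call (performs a real network request):
#   url_exists?("http://newyork.craigslist.org/brk/abo/5449483116.html")
```

Note that redirects (3xx) count as "does not exist" in this sketch; following them would need extra handling.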