I made a handy little link expander using curb (curl) within my Ruby (Sinatra) app.
require 'curb'

def curbexpand(link)
  result = Curl::Easy.new(link)
  begin
    result.headers["User-Agent"] = "..."
    result.verbose = true
    result.follow_location = true
    result.max_redirects = 3
    result.connect_timeout = 5
    result.perform
    return result.last_effective_url # Returns the final destination URL after x redirects...
  rescue
    # Log before returning; a puts placed after the return would never run.
    puts "XXXXXXXXXXXXXXXXXXX Error parsing link XXXXXXXXXXXXXXXXXXXXXXXXXXX"
    return link
  end
end
The problem I have is that some geniuses are using URL shorteners to link to .exe and .dmg files. That would be fine, but it looks like my curb code above waits for the full response to be returned (it could be a 1GB file!) before returning the URL. I don't want to use third-party link expander APIs, as I have a significant volume of links to expand.
Anyone know how I can tweak curb to just find the URL rather than waiting for the full response?
I've done what you want using Net::HTTP to process "HEAD" requests and look for redirects that way. The advantage is that a HEAD request does not return content, only headers.
From the docs:
head(path, initheader = nil)
Gets only the header from path on the connected-to host. header is a Hash like { 'Accept' => '*/*', ... }.
This method returns a Net::HTTPResponse object.
This method never raises an exception.
response = nil
Net::HTTP.start('some.www.server', 80) {|http|
  response = http.head('/index.html')
}
p response['content-type']
Combine that with the example in the Net::HTTP docs for following redirection, and you should be able to find your landing URL.
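As a rough sketch of how those two pieces might fit together, something like the following could work. The method name expand_link and its max_redirects parameter are made up for illustration, and the 5-second timeouts and the limit of 3 simply mirror the values in the question. It issues only HEAD requests, so the body of a large download is never fetched, and it falls back to the original link on any error, as the question's code does.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper (not part of Net::HTTP): follows up to max_redirects
# redirects using HEAD requests only, then returns the last URL reached.
def expand_link(link, max_redirects = 3)
  uri = URI.parse(link)
  max_redirects.times do
    response = Net::HTTP.start(uri.host, uri.port,
                               use_ssl: uri.scheme == 'https',
                               open_timeout: 5, read_timeout: 5) do |http|
      http.head(uri.request_uri) # headers only, no body is transferred
    end
    # Stop at the first response that is not a redirect with a Location header.
    break unless response.is_a?(Net::HTTPRedirection) && response['location']
    # Location may be relative, so resolve it against the current URL.
    uri = URI.join(uri.to_s, response['location'])
  end
  uri.to_s
rescue StandardError
  link # on any failure, return the link unchanged
end
```

Calling expand_link with a shortened URL should then return the landing URL as a string. One caveat: some servers answer HEAD differently from GET (or reject it outright), in which case this sketch just returns the last URL it managed to reach.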
You can probably use Curl::http_head to accomplish much the same thing.