I am trying to parse a list of image URLs and get some basic information before I actually commit to downloading them.
My script will check a large list every day (about 1300 rows), and each row has 30-40 image URLs. My @photo_urls variable allows me to keep track of what I have already downloaded. I would really like to be able to use that as a hash (instead of an array, as in my example code) so I can iterate through it later and do the actual downloading.
Right now my problem (besides being a Ruby newbie) is that Net::HTTP::Pipeline only accepts an array of Net::HTTPRequest objects. The documentation for net-http-pipeline indicates that response objects come back in the same order as the corresponding request objects that went in, so that order is the only way I have to correlate a request with its response. However, I don't know how to get the relative ordinal position inside a block. I assume I could just use a counter variable, but how would I access a hash by ordinal position?
    Net::HTTP.start uri.host do |http|
      # Init HTTP requests hash
      requests = {}
      photo_urls.each do |photo_url|
        # make sure we don't process the same image again.
        hashed = Digest::SHA1.hexdigest(photo_url)
        next if @photo_urls.include? hashed
        @photo_urls << hashed
        # change user agent and store in hash
        my_uri = URI.parse(photo_url)
        request = Net::HTTP::Head.new(my_uri.path)
        request.initialize_http_header({"User-Agent" => "My Downloader"})
        requests[hashed] = request
      end
      # process requests (send array of values - ie. requests) in a pipeline.
      http.pipeline requests.values do |response|
        if response.code == "200"
          # any way to reference the hash here so I can decide whether
          # I want to do anything later?
        end
      end
    end
Finally, if there is an easier way of doing this, please feel free to offer any suggestions.
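Make requests an array instead of a hash and shift each request off the front as its response comes in. Because responses arrive in request order, the request at the front of the array is always the one that matches the current response. The pipeline is given a copy (requests.dup), so shifting the original array inside the block does not disturb what is being sent: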
    Net::HTTP.start uri.host do |http|
      # Init HTTP requests array
      requests = []
      photo_urls.each do |photo_url|
        # make sure we don't process the same image again.
        hashed = Digest::SHA1.hexdigest(photo_url)
        next if @photo_urls.include? hashed
        @photo_urls << hashed
        # change user agent and append to the array
        my_uri = URI.parse(photo_url)
        request = Net::HTTP::Head.new(my_uri.path)
        request.initialize_http_header({"User-Agent" => "My Downloader"})
        requests << request
      end
      # process requests (send a copy of the array) in a pipeline.
      http.pipeline requests.dup do |response|
        # the front of the array is the request this response belongs to
        request = requests.shift
        if response.code == "200"
          # Do whatever checking with request
        end
      end
    end
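If you would rather keep the hash keyed by digest, the same shift trick can be applied to the keys. This is only a sketch of that idea: since Ruby 1.9 a Hash preserves insertion order, so requests.keys and requests.values line up position by position. The names fake_pipeline and FakeResponse are hypothetical stand-ins for http.pipeline and its responses, used here so the example runs without the gem or a network connection; the example.com URLs are placeholders too.

```ruby
require 'digest/sha1'
require 'net/http'
require 'uri'

# Hypothetical stand-in for a pipelined response (not part of net-http-pipeline).
FakeResponse = Struct.new(:code)

# Hypothetical stand-in for http.pipeline: yields one response per request,
# in the same order the requests were given.
def fake_pipeline(requests)
  requests.each { |_request| yield FakeResponse.new('200') }
end

# Placeholder URLs for illustration only.
photo_urls = %w[http://example.com/a.jpg http://example.com/b.jpg]

# Hash keyed by digest; insertion order is preserved (Ruby 1.9+).
requests = {}
photo_urls.each do |photo_url|
  hashed = Digest::SHA1.hexdigest(photo_url)
  request = Net::HTTP::Head.new(URI.parse(photo_url).request_uri)
  request['User-Agent'] = 'My Downloader'
  requests[hashed] = request
end

# Queue of keys in the same order as requests.values.
pending = requests.keys
results = {}

fake_pipeline(requests.values) do |response|
  hashed = pending.shift            # key belonging to this response
  results[hashed] = response.code   # record whatever you need per image
end
```

With the real gem, the block body would stay the same and fake_pipeline(requests.values) would become http.pipeline(requests.values). Afterwards results maps each digest to its status code, which you can iterate over when deciding what to download.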