
Trying to find image url via xpath using Mechanize


I am trying to find the image xpath for the following page: http://www.spoonsisters.com/product/1032000/38710.html

I can see the image URL in my browser, but when I try to find it with Mechanize:

page = Agent.get("http://www.spoonsisters.com/product/1032000/38710.html")
page.parser.xpath('//*[@id="main_image"]')
 => [#<Nokogiri::XML::Element:0x80484c7c name="img" attributes=[#<Nokogiri::XML::Attr:0x80484bdc name="id" value="main_image">, #<Nokogiri::XML::Attr:0x80484bc8 name="src">, #<Nokogiri::XML::Attr:0x80484b8c name="alt" value="Paper Cocktail Napkins - What happens tonight goes on Facebook tomorrow">]>] 

The src attribute comes back blank. How do I find the image URL?


Solution

  • That's because the image's src is set by JavaScript when the page loads. If you view the page source and search for "main_image", you'll see the following:

    <img id="main_image" src="" alt="Bar Towel - Wine Varietals" />
    

    Mechanize can't run JavaScript, so the src attribute it sees will always be an empty string.
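
    Since the src is filled in by a script, one possible workaround is to look for the image path inside the script text itself rather than in the img tag. The following is only a rough sketch: SAMPLE_SCRIPT, its file paths, and the helper image_urls_in are all made up for illustration, and the real page's JavaScript may store the URL in a completely different form.

    ```ruby
    # Hypothetical inline script, standing in for whatever the real page runs on load.
    # The paths below are invented for this example.
    SAMPLE_SCRIPT = <<~JS
      var images = ["/images/products/38710_large.jpg", "/images/products/38710_thumb.png"];
      document.getElementById("main_image").src = images[0];
    JS

    # Matches quoted strings that end in a common image extension.
    IMAGE_URL_PATTERN = /["']([^"']+\.(?:jpe?g|png|gif))["']/i

    # Returns every distinct image-like path or URL found in a blob of JavaScript source.
    # (Hypothetical helper, not part of Mechanize or Nokogiri.)
    def image_urls_in(script_text)
      script_text.scan(IMAGE_URL_PATTERN).flatten.uniq
    end

    puts image_urls_in(SAMPLE_SCRIPT)
    ```

    With Mechanize, the inline script text could be collected with something like page.parser.xpath('//script').map(&:text).join("\n") and passed to the helper. If the script is loaded from an external file, that file would need to be fetched separately, and any relative path found would still have to be resolved against the page URL. If none of that turns up the image, a tool that actually executes JavaScript (a headless browser, for example) is probably needed.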