Getting a specified number of results when parsing with Mechanize


I am parsing CNN.com to get the top five news stories along with their first paragraphs. I have the following code:

require 'mechanize'

url = "http://edition.cnn.com/?refresh=1"
agent = Mechanize.new
page = agent.get(url)

# Resolve each headline link against the page URI, then fetch the article.
links = page.search("//div[@id='cnn_maintt2bul']/div/div/ul/li[count(*)=3]/a")
links.map { |a| page.uri.merge(a[:href]) }.each do |uri|
  article = agent.get(uri).parser
  # Print the text of the paragraph that follows the .adtag15090 element.
  puts article.css(".adtag15090+ p").text
  puts "\n"
end

It's not perfect, but it works. However, it retrieves all the articles, and I only want five. Is there a way, perhaps using ranges, to limit the number of results to five?


Solution

  • The simplest way is to slice the result of search. Nokogiri returns a NodeSet from a search, and NodeSet supports [] with a start index and a length, much like an Array:

    page.search("//div[@id='cnn_maintt2bul']/div/div/ul/li[count(*)=3]/a")[0, 5]...