Tags: html, ruby, parsing, nokogiri

How do I parse Google search results with Nokogiri?


I need help pulling URLs from Google search results and was told to use Nokogiri. I installed it and read over the Nokogiri docs, but have no idea where to start -- it's all Greek to me.

I know that what I am looking for is the URL of each result, which is the text inside a <cite> tag. So far, all I have figured out is how to pull the search results page; I don't know how to extract specific data from it. Here is the teeny-tiny bit of code I do have:

serp = Nokogiri::HTML(open("http://www.google.com/search?num=100&q=stackoverflow"))

Solution

  • Select every <cite> element with search and print its text. Enjoy :)

    require 'open-uri'
    require 'nokogiri'
    
    # Kernel#open no longer accepts URLs as of Ruby 3.0; use URI.open from open-uri.
    page = URI.open("http://www.google.com/search?num=100&q=stackoverflow")
    html = Nokogiri::HTML(page)
    
    html.search("cite").each do |cite|
      puts cite.inner_text
    end
    

    Also look at the Nokogiri tutorials.