ruby anemone

Anemone: print links on first page


I'd like to know what I'm doing wrong here.

I need to print the links on the parent page, even if they point to another domain, and then exit.

require 'anemone'
url = ARGV[0]
Anemone.crawl(url, :depth_limit => 1) do |anemone|
  anemone.on_every_page do |page|
    page.links.each do |link|
      puts link
    end
  end
end

What am I doing wrong?

Edit: It outputs nothing.


Solution

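  • This worked for me. It reads one URL per line from a file and prints every href found in each crawled page's parsed document, including links to other domains: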

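     require 'anemone'
     require 'uri'

     file = ARGV[0]  # text file with one URL per line
     File.foreach(file) do |line|
       next if line.strip.empty?  # skip blank lines
       # URI.encode was removed in Ruby 3.0, so the stripped line is parsed
       # directly; URLs with unsafe characters must already be escaped.
       url = URI.parse(line.strip)
       Anemone.crawl(url, :discard_page_bodies => true) do |anemone|
         anemone.on_every_page do |page|
           doc = page.doc
           next if doc.nil?  # no parsed document, e.g. a non-HTML response
           # Take every href straight from the document, whatever its domain.
           doc.xpath("//a/@href").each do |link|
             puts link.to_s
           end
         end
       end
     end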
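
The hrefs printed by the code above are raw attribute values, so relative links such as /about come out unresolved. If absolute URLs are wanted, something like the following might do. It is only a rough sketch using the standard library's URI.join; the helper name absolutize and the example URLs are made up for illustration and are not part of Anemone.

```ruby
require 'uri'

# Hypothetical helper (not part of Anemone): resolve raw href strings
# against the URL of the page they were found on.
# Hrefs that cannot be parsed as URIs are skipped, and duplicates are dropped.
def absolutize(base, hrefs)
  resolved = hrefs.map do |href|
    begin
      URI.join(base, href.strip).to_s
    rescue URI::Error
      nil  # unparsable href, e.g. one with a space in the host
    end
  end
  resolved.compact.uniq
end

# Example URLs are placeholders.
puts absolutize('http://example.com/a/index.html',
                ['/about', 'page2.html', 'http://other.org/x'])
```

Inside on_every_page this could be called with page.url as the base and the xpath results mapped through to_s.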