I wanted to see what I am doing wrong here.
I need to print the links on the parent page, even if they point to another domain, and then exit.
require 'anemone'

url = ARGV[0]

Anemone.crawl(url, :depth_limit => 1) do |anemone|
  anemone.on_every_page do |page|
    page.links.each do |link|
      puts link
    end
  end
end
What am I not doing right?
Edit: It outputs nothing.
This worked for me:
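require 'anemone'
require 'uri'

# Path to a text file with one URL per line.
file = ARGV[0]

File.foreach(file) do |line|
  # NOTE: URI.encode was removed in Ruby 3.0, so this line needs an older Ruby.
  url = URI.parse(URI.encode(line.strip))
  Anemone.crawl(url, :discard_page_bodies => true) do |anemone|
    anemone.on_every_page do |page|
      # page.doc is nil when the response could not be parsed as HTML.
      next if page.doc.nil?
      # Read every href straight from the HTML instead of using page.links,
      # which (as far as I can tell) only keeps links on the crawled domain.
      page.doc.xpath("//a/@href").each do |link|
        puts link.to_s
      end
    end
  end
end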
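One thing to watch for: the XPath approach prints each href exactly as it is written in the HTML, so relative links such as /about come out without a host. If you want absolute URLs, something along these lines might help. It is only a sketch using the standard uri library; absolute_links is a made-up helper name, not part of Anemone, and the example.com URLs are placeholders.

```ruby
require 'uri'

# Hypothetical helper (not part of Anemone): resolve raw href values
# against the URL of the page they were found on.
def absolute_links(page_url, hrefs)
  resolved = hrefs.map do |href|
    begin
      URI.join(page_url, href.strip).to_s
    rescue URI::Error
      nil # skip hrefs that are not valid URIs
    end
  end
  resolved.compact.uniq
end

# Placeholder inputs for illustration only.
hrefs = ['/about', 'contact.html', 'http://other.example.org/x']
puts absolute_links('http://example.com/dir/index.html', hrefs)
```

Inside the on_every_page block you could pass page.url.to_s as the first argument and the href strings collected from page.doc as the second.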