Tags: ruby, recursion, mechanize, web-crawler

Use Mechanize to retrieve ALL links of a website


How can I use the Mechanize library to find all the links on a website?

I'd like to follow the internal links recursively so that I end up with every link on the site.


Solution

  • Have you looked at the Anemone gem? It was specifically created for spidering websites.

    You could do something like this to grab and print all the links of a website:

    require 'anemone'
    
    Anemone.crawl("http://www.example.com/") do |anemone|
      # on_every_page yields each page as it is crawled; focus_crawl would
      # instead expect the block to return the links Anemone should follow next
      anemone.on_every_page { |page| puts page.links }
    end
    

    It is fairly well documented, with options to control whether you spider the entire site, exclude certain types of links, or skip links matching a given pattern.
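If you do want to stay with Mechanize rather than pull in Anemone, the recursive part is a plain breadth-first traversal that you have to write yourself. Here is a minimal sketch of that traversal; the `fetch` callable is an assumption introduced for illustration (it is not a Mechanize API) so the logic can run without the network — with Mechanize you would pass something like `->(url) { agent.get(url).body }`, and use `page.links` instead of the naive `href` regex below.

```ruby
require 'set'
require 'uri'

# Breadth-first crawl that collects every internal link of a site.
# `fetch` is any callable returning the HTML body for a URL.
def crawl(start_url, fetch, max_pages: 100)
  root  = URI(start_url).host
  seen  = Set.new([start_url])
  queue = [start_url]

  until queue.empty? || seen.size >= max_pages
    url  = queue.shift
    html = fetch.call(url) rescue next   # skip pages that fail to load

    # Naive href extraction for the sketch; Mechanize's page.links is more robust.
    html.scan(/href="([^"]+)"/).flatten.each do |href|
      link = URI.join(url, href).to_s rescue next
      next unless URI(link).host == root # stay on the same site
      next if seen.include?(link)
      seen  << link
      queue << link
    end
  end

  seen.to_a
end
```

Because the fetcher is injected, you can exercise the traversal against an in-memory hash of pages before pointing it at a live site, and `max_pages` keeps a runaway crawl bounded.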