ruby mechanize mechanize-ruby

Ruby Mechanize scraping ResponseCodeError


I'm trying to scrape every results page of a website. It works, but sometimes the script stops and displays this error:

502 => Net::HTTPBadGateway for https://website.com/id/12/ -- unhandled response (Mechanize::ResponseCodeError)

I would like the script to continue even when it hits an error.

My script:

require 'mechanize'
require 'csv'

a = Mechanize.new
CSV.open('datas.csv', "wb") do |csv|
    page = a.get("https://website.com/?page=1-200") #498
    number = 0
    page.links_with(:class => "btn btn-default").each do |link|
        post_link = link.href
        inside_page = a.get("https://website.com#{post_link}")
        title = inside_page.at("h1.serviceTitle").text.strip
        author = inside_page.at(".name").text.strip
        number+=1
        csv << [title, author]
    end
end

Any ideas?


Solution

  • This is easily solved with proper exception handling. You can check this page for a better explanation.

    As for your code, you can handle the exception like so:

    require 'mechanize'
    require 'csv'

    a = Mechanize.new
    CSV.open('datas.csv', "wb") do |csv|
      page = a.get("https://website.com/?page=1-200") #498
      number = 0
      page.links_with(:class => "btn btn-default").each do |link|
        # Rescue inside the loop so one failed page does not end the whole run
        begin
          post_link = link.href
          inside_page = a.get("https://website.com#{post_link}")
          title = inside_page.at("h1.serviceTitle").text.strip
          author = inside_page.at(".name").text.strip
          number += 1
          csv << [title, author]
        rescue Mechanize::ResponseCodeError => e
          # Report the failed page and move on to the next link
          warn "Skipping #{post_link}: #{e.response_code}"
        end
      end
    end
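
A 502 is often temporary, so it may be worth retrying a page a few times before skipping it. The following is only a sketch of that idea: `with_retries` is a hypothetical helper (not part of Mechanize), and `FakeGatewayError` is a stand-in error class so the example runs without network access. In the real script you would presumably pass `Mechanize::ResponseCodeError` instead.

```ruby
# Hypothetical helper (not part of Mechanize): run the block and retry it
# when it raises one of the given error classes, up to `attempts` times.
# Returns the block's value, or nil once every attempt has failed.
def with_retries(*error_classes, attempts: 3, wait: 0)
  tries = 0
  begin
    tries += 1
    yield tries
  rescue *error_classes => e
    if tries < attempts
      sleep(wait) # pause before the next attempt
      retry
    end
    warn "Giving up after #{tries} attempts: #{e.message}"
    nil
  end
end

# Stand-in for Mechanize::ResponseCodeError, used only for this demo.
class FakeGatewayError < StandardError; end

# Simulate a page that fails twice and succeeds on the third request.
calls = 0
result = with_retries(FakeGatewayError, attempts: 3) do
  calls += 1
  raise FakeGatewayError, "502 Bad Gateway" if calls < 3
  "page body"
end
```

In the scraper, the call to `a.get` for each link could be wrapped the same way, for example `with_retries(Mechanize::ResponseCodeError, attempts: 3, wait: 2) { a.get(url) }`, moving on to the next link with `next` when the result is `nil`.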