Robustly call a flaky API: proper error handling with Net::HTTP


I hacked this together as a seemingly robust way to call a flaky webservice that was giving timeouts and the occasional name resolution or socket error or whatever. I thought I'd put it here in case it's useful or, more likely, to be told a better way to do this.

require 'net/http'

retries = 5
begin
  url = URI.parse('http://api.flakywebservice.com')
  http = Net::HTTP.new(url.host, url.port)
  http.read_timeout = 600  # be very patient
  res = nil
  http.start{|http|
    req = Net::HTTP::Post.new(url.path)
    req.set_form_data(params)  # send a hash of the POST parameters
    res = http.request(req)
  }
rescue Exception   # should really list all the possible http exceptions
  sleep 3
  retry if (retries -= 1) > 0
end

# finally, do something with res.body, like JSON.parse(res.body)

The heart of this question is: What all exceptions should I be looking for when making a call to a webservice like this? Here's an attempt to collect them all, but it seems like there's got to be a better way than that: http://tammersaleh.com/posts/rescuing-net-http-exceptions


Solution

  • Exceptions are meaningful, and Net::HTTP offers specific exceptions for different sorts of cases. So if you want to handle them each in a particular way, you can.

    That article says that rescuing those specific exceptions is better/safer than rescue Exception, and that's very true. BUT rescue Exception is different from a bare rescue, which is equivalent to rescue StandardError and is what you should usually do by default unless you have a reason to do something else.

    Rescuing top-level Exception will rescue anything that could possibly happen in the entire execution stack, including some part of ruby running out of disk or memory or having some obscure system-related IO problem.
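To make the difference concrete, here is a small sketch (the helper name bare_rescue_catches? is made up for illustration) that raises a given exception class and reports whether a bare rescue would have caught it. It only shows the class hierarchy at work; it is not something you'd put in production code.

```ruby
require 'timeout'

# Hypothetical helper: true if a bare `rescue` (i.e. rescue StandardError)
# catches an exception of the given class, false if it slips past.
def bare_rescue_catches?(error_class)
  begin
    raise error_class, 'boom'
  rescue
    # Bare rescue: only StandardError and its descendants land here.
    true
  end
rescue Exception
  # Anything outside StandardError ends up here instead.
  false
end

bare_rescue_catches?(Timeout::Error)     # => true
bare_rescue_catches?(Errno::ECONNRESET)  # => true
bare_rescue_catches?(Interrupt)          # => false (Ctrl-C)
bare_rescue_catches?(NoMemoryError)      # => false
```

The last two are the kind of thing rescue Exception swallows and a bare rescue leaves alone.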

    So, as far as "what to rescue" goes, you're generally better off changing your code to a bare rescue (or rescue StandardError). You'll catch everything you want to, and nothing that you don't. ~However, in this particular case, there is one lone exception in that guy's list that is NOT a descendant of StandardError~ [2024 update: Timeout::Error now does descend from StandardError and has for several years], so the check below comes back empty:

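    require 'net/http'
    require 'timeout'

    # Returns the superclass chain of a class, root first, ending with the class itself.
    def parents(obj)
      ( (obj.superclass ? parents(obj.superclass) : []) << obj)
    end

    # Collect every listed exception that does NOT descend from StandardError.
    [Timeout::Error, Errno::EINVAL, Errno::ECONNRESET, EOFError, Net::HTTPBadResponse,
      Net::HTTPHeaderSyntaxError, Net::ProtocolError].inject([]) do |a,c|
      parents(c).include?(StandardError) ? a : a << c
    end
    # => []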
    

    So you could change your code to rescue StandardError => e and you'll cover all the cases mentioned in that article, and more, but not the stuff that you don't want to cover. (the => e is not required, but more on that below).
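Here's roughly what that could look like applied to the code in the question. This is only a sketch: with_retries and post_to_flaky_api are names I made up, and the attempt count, delay and timeout are just the values from the question.

```ruby
require 'net/http'

# Hypothetical helper: runs the block, retrying on any StandardError.
# Re-raises the last error once the attempts are used up.
def with_retries(attempts: 5, delay: 3)
  tries = 0
  begin
    tries += 1
    yield tries
  rescue StandardError => e
    warn "attempt #{tries} failed: #{e.class}: #{e.message}"
    raise if tries >= attempts
    sleep delay
    retry
  end
end

# Hypothetical wrapper around the request from the question.
# Defined here but not called, since the URL is a placeholder.
def post_to_flaky_api(params)
  url = URI.parse('http://api.flakywebservice.com/')
  with_retries do
    http = Net::HTTP.new(url.host, url.port)
    http.read_timeout = 600  # be very patient, as in the question
    req = Net::HTTP::Post.new(url.path)
    req.set_form_data(params)
    http.request(req)  # opens the connection if it isn't started yet
  end
end
```

One difference from the original: if every attempt fails, this re-raises the last exception instead of falling through with res still nil, so you don't get a confusing NoMethodError later on res.body.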

    Now, as far as your actual technique for dealing with the flaky API -- the question is, what's the problem with the API that you are dealing with? Badly formatted responses? No responses? Is the problem at the HTTP level or in the data you are getting back?

    Maybe you don't yet know, or you don't yet care, but you know that retrying tends to get the job done. In that case, I would at least recommend logging the exceptions. Hoptoad (since renamed Airbrake) has a free plan, and has some sort of thing like Hoptoad.notify(e) -- I can't remember if that's the exact invocation. Or you can email it or log it, using e.message and e.backtrace.
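If you just want it in a log file, something along these lines with Ruby's standard Logger would do. The log_exception helper and the five-line backtrace limit are my own choices for the sketch, not anything Net::HTTP or a notifier service gives you.

```ruby
require 'logger'

# Hypothetical helper: writes the exception class, message and the
# first few backtrace lines to the given logger.
def log_exception(logger, error, max_lines: 5)
  logger.error("#{error.class}: #{error.message}")
  # backtrace is nil for an exception that was never raised.
  (error.backtrace || []).first(max_lines).each do |line|
    logger.error("  #{line}")
  end
end

logger = Logger.new($stderr)
begin
  raise IOError, 'socket closed'  # stand-in for a failed API call
rescue StandardError => e
  log_exception(logger, e)
end
```

Once you have a few days of these, you can see which exception classes actually show up and decide whether a blanket retry is still the right answer.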