ruby-on-rails heroku delayed-job

How to avoid meeting Heroku's API rate limit with delayed job and workless


My Survey model has about 2500 instances and I need to apply the set_state method to each instance twice. I need to apply it the second time only after every instance has had the method applied to it once. (The state of an instance can depend on the state of other instances.)

I'm using delayed_job to create delayed jobs and workless to automatically scale up/down my worker dynos as required.

The set_state method typically takes about a second to execute, so I've run the following at the Heroku console:

2.times do
  Survey.all.each do |survey|
    survey.delay.set_state
    sleep(4)
  end
end

Shouldn't be any issues with overloading the API, right?

And yet I'm still seeing the following in my logs for each delayed job:

Heroku::API::Errors::ErrorWithResponse: Expected(200) <=> Actual(429 Unknown)

I'm not seeing any infinite loops -- it just returns this message as soon as I create the delayed job.

How can I avoid blowing Heroku's API rate limits?


Solution

  • Reviewing workless, it looks like it incurs an API call per delayed job to check the worker count and potentially a second API call to scale up/down. So if you are running 5000 (2500x2) jobs within a short period, you'll end up with 5000+ API calls, which is well in excess of the limit of 1200 requests per hour. The sleep(4) doesn't fully rescue you either: it still enqueues about 900 jobs per hour, and at one to two calls each that can come to as many as 1800 calls per hour. I've commented over there to hopefully help toward reducing the overall API usage (https://github.com/lostboy/workless/issues/33#issuecomment-20982433), but I think we can offer a more specific solution for you.

    In the meantime, especially if your workload is pretty predictable (like this), I'd recommend skipping workless and doing that portion yourself. It sounds like you already know WHEN the scaling would need to happen (scale up right before the loop above, scale down right after). If that is the case, you could do something like this to emulate the behavior in workless:

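    require 'heroku-api'
    heroku = Heroku::API.new(:api_key => ENV['HEROKU_API_KEY'])

    # Scale up once before enqueueing (one API call). Survey.count asks for one
    # worker per survey; use a smaller number if your account caps dyno counts.
    heroku.post_ps_scale(ENV['APP_NAME'], 'worker', Survey.count)
    2.times do
      Survey.all.each do |survey|
        survey.delay.set_state
        sleep(4)
      end
    end
    # Scale back down once afterwards (one API call), mirroring workless's minimum.
    min_workers = ENV['WORKLESS_MIN_WORKERS'].present? ? ENV['WORKLESS_MIN_WORKERS'].to_i : 0
    heroku.post_ps_scale(ENV['APP_NAME'], 'worker', min_workers)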
    

    Note that you'll also need to remove workless from these jobs. I didn't see a particular way to do this JUST for certain jobs though, so you might want to ask on that project if you need that. Also, if this needs to be two-pass (the first pass has to finish before the second starts), the 4 second sleep may in some cases be insufficient, since nothing guarantees the queue is empty when the second pass begins, but that is a different can of worms.

    I hope that helps narrow in on what you needed, but I'm certainly happy to discuss further and/or elaborate on the above as needed. Thanks!
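
If the second pass really must wait for the first, one option is to replace the fixed sleep with a barrier that polls the queue between passes. Below is a rough sketch of that idea, not something taken from workless or delayed_job. The method name `run_in_passes` and all of its keyword arguments are made up for illustration, and the collaborators are passed in as lambdas so the snippet runs outside Rails. It scales up once, enqueues a full pass, waits until the pending count reaches zero, repeats, and scales down once, so the whole run costs two scaling calls.

```ruby
# Hypothetical helper: runs `passes` full passes over `items`, waiting for the
# job queue to drain between passes. Nothing here is part of workless or
# delayed_job; every collaborator is injected as a callable.
#
#   enqueue - called with each item to create a delayed job
#   pending - returns the number of jobs still waiting or running
#   scale   - called with a worker count to scale the worker dynos
def run_in_passes(items, passes:, enqueue:, pending:, scale:, workers:,
                  idle_workers: 0, poll_interval: 5)
  scale.call(workers) # one scaling call before any work is queued
  passes.times do
    items.each { |item| enqueue.call(item) }
    # Barrier: do not start the next pass until the queue is empty.
    sleep(poll_interval) while pending.call > 0
  end
ensure
  scale.call(idle_workers) # one scaling call afterwards, even on error
end

# Possible wiring in a Rails console (assumes the heroku-api client from the
# answer above and the ActiveRecord backend for delayed_job; the worker count
# of 5 is an arbitrary example):
#
#   heroku = Heroku::API.new(:api_key => ENV['HEROKU_API_KEY'])
#   run_in_passes(Survey.all, passes: 2, workers: 5,
#     enqueue: ->(survey) { survey.delay.set_state },
#     pending: -> { Delayed::Job.where(failed_at: nil).count },
#     scale:   ->(n) { heroku.post_ps_scale(ENV['APP_NAME'], 'worker', n) })
```

One caveat with polling the job count: a job that keeps failing and being retried stays in the table, so the barrier can wait a long time. You may want a timeout or a stricter `pending` query for your setup.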