ruby-on-rails delayed-job resque sidekiq

How to manage a pool of background servers in Rails


Our Rails application has some very intensive background processes, sometimes taking several hours to run. We are using delayed_job, and would consider moving to Resque or the free version of Sidekiq if it made sense in the context of this question.

We are hitting 100% CPU on all processors for some of the jobs, and the background processors currently run on the same physical server as Nginx, Rails and Postgres. We also expect the load to rise.

We would like to move the background processing off to a pool of commodity-level batch processing VMs, and preferably spin them up as needed. My current thinking is to extract the perform code into mini-apps and deploy them to the batch processing VMs.

What I am not sure about is how to code this, and how to load-balance the job queues across different VMs. Is this something that delayed_job/Resque/Sidekiq can do, or do I need to code it myself?

EDIT

Some useful links I have found on this topic

http://www.slideshare.net/kigster/12step-program-for-scaling-web-applications-on-postgresql

Use multiple Redis servers in Sidekiq

https://stackoverflow.com/a/19540427/993592


Solution

  • My personal preference is Sidekiq. I'd be a little concerned about "several hour" jobs and what happens if they fail in the middle. By default Sidekiq will try to re-run them. You can change that, but you definitely want to think through the scenario. This is of course true for whatever background job processing system you use. IMHO I'd try to find a way to break those big jobs up into smaller jobs, even if it's just "job part 1 runs, then enqueues job part 2", etc.

    As for scalability, Sidekiq's only real limit is Redis. See here for some options on that: https://github.com/mperham/sidekiq/wiki/Sharding

    As for load balancing, Sidekiq does it by default, because every Sidekiq process pulls jobs from the same queues. I currently run two Sidekiq servers that pull from a single Redis instance, each with 25 workers and about 12 queues. It works amazingly well.
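
To make the "part 1 enqueues part 2" idea and the shared-queue load balancing concrete, here is a minimal stdlib-only Ruby sketch. It is not Sidekiq's API: a `Queue` stands in for the Redis-backed queue, and each thread stands in for a Sidekiq process on its own VM. The names `Job`, `run_chained_jobs`, `slice_size` and the task names are hypothetical, chosen only for illustration.

```ruby
# Stdlib-only simulation: Queue stands in for the Redis-backed job queue,
# and each thread stands in for a Sidekiq process on its own VM.
Job = Struct.new(:task, :offset)

# tasks: hypothetical map of task name => number of records to process.
def run_chained_jobs(tasks, slice_size:, worker_count:)
  return [] if tasks.empty?

  queue   = Queue.new   # shared queue that every worker pulls from
  results = Queue.new   # thread-safe log of processed slices
  pending = tasks.size  # chains that have not finished yet
  lock    = Mutex.new

  workers = Array.new(worker_count) do |worker_id|
    Thread.new do
      # pop blocks while the queue is empty and returns nil once it is closed
      while (job = queue.pop)
        total = tasks.fetch(job.task)
        count = [slice_size, total - job.offset].min
        results << { worker: worker_id, task: job.task, offset: job.offset, count: count }

        next_offset = job.offset + count
        if next_offset < total
          queue << Job.new(job.task, next_offset) # "part 1 enqueues part 2"
        else
          lock.synchronize do
            pending -= 1
            queue.close if pending.zero? # last chain done: let workers exit
          end
        end
      end
    end
  end

  tasks.each_key { |task| queue << Job.new(task, 0) } # seed one job per task
  workers.each(&:join)
  Array.new(results.size) { results.pop }
end

slices = run_chained_jobs({ "report_a" => 10, "report_b" => 7 },
                          slice_size: 4, worker_count: 2)
slices.each do |s|
  puts "worker #{s[:worker]} ran #{s[:task]} from offset #{s[:offset]} (#{s[:count]} records)"
end
```

Which worker picks up which slice varies from run to run, which is the point: no worker is assigned work in advance, each one simply pulls the next job when it is free. Because each slice is short, a failure only costs one slice rather than several hours of work. In a real Sidekiq setup the re-enqueue step would presumably be a `perform_async` call from inside the worker's `perform` method, with all VMs pointed at the same Redis.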