ruby, ruby-on-rails-4, sidekiq

Finish Sidekiq queues much quicker


I have reached a point where a queue takes too long to finish because new jobs keep being added to it. What are the best options for overcoming this problem?

I already run 50 worker processes, but I noticed that when I start more, jobs take longer to finish.

My setup: nginx, Unicorn, Ruby on Rails 4, PostgreSQL

Thank you


Solution

  • You need to measure where you are constrained by resources.

    If things slow down as you add more workers, you're likely blocked on a shared backend such as your database server or Redis. Have you upgraded your Redis server to handle this amount of load? Where are you storing the scraped data? Can that system handle the increased write load?

    If you were blocked on CPU or I/O, you would see throughput scale linearly as you add more workers. Since things slow down when you scale out, you should measure where the problem is. I'd recommend instrumenting your worker processes with New Relic and measuring where the time is being spent.

    My guess would be that your Redis instance can't handle the load to manage the work queue with 50 worker processes.
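
    If you want a quick look at per-job timings before setting up a full monitoring tool, a small server middleware can log how long each job takes. This is only a rough sketch: `JobTimer` and its log format are hypothetical names made up for illustration, and it assumes the standard Sidekiq server middleware interface (`call(worker, job, queue)` with a block).

    ```ruby
    # Sketch of a Sidekiq-style server middleware that logs each job's duration.
    # JobTimer and the log line format are hypothetical, chosen for this example.
    class JobTimer
      # `out` is any object responding to #puts (defaults to standard output).
      def initialize(out = $stdout)
        @out = out
      end

      # Sidekiq invokes server middleware with the worker instance,
      # the job payload hash, and the queue name, then expects a yield.
      def call(worker, job, queue)
        started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
        yield
      ensure
        # Runs whether the job succeeded or raised, so failures are timed too.
        elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
        elapsed_ms = (elapsed * 1000).round(1)
        @out.puts "job=#{worker.class.name} queue=#{queue} " \
                  "jid=#{job['jid']} elapsed_ms=#{elapsed_ms}"
      end
    end

    # Registration inside a Sidekiq process would look roughly like this
    # (left as a comment so the sketch runs without the gem installed):
    #
    #   Sidekiq.configure_server do |config|
    #     config.server_middleware { |chain| chain.add JobTimer }
    #   end
    ```

    If the logged durations grow as you add processes, that points at a shared resource (Redis, PostgreSQL, or the remote sites) rather than at the workers themselves.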

    EDIT: Based on your comment, it sounds like you're entirely I/O-bound doing web scraping. In that case, you should increase the concurrency of each Sidekiq worker process using the -c option to spawn more threads. Having more threads lets you keep processing scraping jobs even while other scrapers are blocked on network I/O.
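
To illustrate why extra threads help with I/O-bound work, here is a small self-contained sketch in plain Ruby. It does not use Sidekiq at all: `run_jobs` and its parameters are hypothetical names, and `sleep` stands in for a scraper waiting on the network.

```ruby
# Simulates I/O-bound jobs with a fixed pool of threads.
# run_jobs, job_count, concurrency, and io_delay are hypothetical names;
# sleep is a stand-in for a blocking network call.
def run_jobs(job_count:, concurrency:, io_delay:)
  pending = Queue.new
  job_count.times { |i| pending << i }
  finished = Queue.new

  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  threads = Array.new(concurrency) do
    Thread.new do
      begin
        loop do
          job = pending.pop(true) # non-blocking pop; raises ThreadError when empty
          sleep(io_delay)         # the thread is "blocked on I/O" here
          finished << job
        end
      rescue ThreadError
        # No jobs left for this thread.
      end
    end
  end
  threads.each(&:join)
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

  [finished.size, elapsed]
end

_, one_thread = run_jobs(job_count: 10, concurrency: 1, io_delay: 0.02)
_, ten_threads = run_jobs(job_count: 10, concurrency: 10, io_delay: 0.02)
puts format("1 thread: %.2fs, 10 threads: %.2fs", one_thread, ten_threads)
```

With Sidekiq itself, the equivalent change is starting each process with something like `bundle exec sidekiq -c 25`, where 25 is only an illustrative value. If your jobs use ActiveRecord, the database connection pool needs to be at least as large as the concurrency setting, otherwise threads will wait on connections instead of on the network.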