
How to import millions of users without compromising Sidekiq


I have a rake task like this

task import_all: :environment do
  Person.find_each do |p|
    UserWorker.perform_async(p.to_global_id.to_s) # Sidekiq args must be JSON-native types
  end
end

Each user creates around 20 new jobs in Sidekiq, and each of those 20 jobs creates another 4 or 5 jobs. It turns out I have 980k users, and my Redis is at almost 70% CPU usage, enqueuing more than 7 million jobs!
How can I import all these users without compromising CPU/memory?


Solution

  • You're spawning a background worker for every Person; why not spawn just one?

    # some rakefile I'm guessing?
    task import_all: :environment do
      UserWorker.perform_async
    end
    
    # UserWorker.rb
    class UserWorker
      include Sidekiq::Worker
      def perform
        Person.find_each do |p|
          # whatever you were doing in here before...
        end
      end
    end
    

    Unless you have a massively parallel backend, there's little advantage in breaking the work into one job per user. You'll want to handle and log errors inside the loop as appropriate, and perhaps retry individual failures, but the general solution shouldn't need such fine-grained workers.
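
    If you do want per-user parallelism, a middle ground (a sketch, assuming Sidekiq and the `Person`/`UserWorker` classes from the question) is to enqueue in bulk: `Sidekiq::Client.push_bulk` pushes many jobs per Redis round-trip instead of one network call per job, which is where most of the Redis CPU goes. The Rails-dependent part is shown in comments; the arg-set construction it relies on is plain Ruby:

    ```ruby
    # In the rake task (hypothetical sketch, not tested against your app):
    #
    #   Person.in_batches(of: 1_000) do |batch|
    #     Sidekiq::Client.push_bulk(
    #       'class' => UserWorker,
    #       'args'  => batch.ids.map { |id| [id] } # one args array per job
    #     )
    #   end
    #
    # Building the 'args' payload is ordinary Ruby: slice the ids into
    # batches and wrap each id in its own argument array.
    ids = (1..2_500).to_a                       # stand-in for Person ids
    arg_sets = ids.each_slice(1_000).map do |slice|
      slice.map { |id| [id] }                   # [[1], [2], ...] per batch
    end
    puts arg_sets.size        # 3 batches
    puts arg_sets.last.size   # 500 jobs in the final batch
    ```

    Keeping each `push_bulk` call to around 1,000 jobs keeps individual Redis commands small while still cutting round-trips by three orders of magnitude.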