ruby-on-rails ruby asynchronous sidekiq delayed-job

How to run batch job simultaneously with record updates?


I have a Rails app in which I need to add late fees to millions of overdue invoices.

Currently I have a Que-backed job which runs once per day and selects batches of invoices using .find_in_batches(batch_size: 100). I tagged other common job libraries because I'm sure the same problem carries over. Note, however, that Que is database-backed, not Redis-backed.

The problem is that a customer may pay their invoice at exactly the moment the job is running, so the late fee gets added to an invoice that has already been paid.

I'm not sure how to reconcile this difference in a performant way (or any way, to be honest).

I will be doing this for millions of rows, so the method must be fast.

What are different strategies for doing this?


Solution

  • I think you already outlined the contradiction. You want to do something queue-related in a performant way, and your queue is database-backed.

    I've done something similar, and have had a great experience using Sidekiq. I would have the first job do .find_in_batches(...) and use that to schedule each overdue invoice as a separate job.

    When each job runs, it needs to recheck whether the invoice has been paid before adding the fee.

    With Sidekiq, each process runs several worker threads (25 by default at the time of writing; newer Sidekiq releases default to fewer), which vastly improves throughput. You can fine-tune this and scale up the number of Sidekiq processes as needed. Scaling Sidekiq is really a pleasure! Do note that database connections might become a choke point.
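
The fan-out and recheck steps described in the answer can be sketched roughly as follows. This is a plain-Ruby illustration, not the answerer's actual code: it runs without Rails or Sidekiq, and every name in it (Invoice, enqueue_overdue, add_late_fee, LATE_FEE_CENTS, the paid and late_fee_cents fields) is hypothetical. The Mutex stands in for a database row lock; in a Rails app the same shape would presumably use find_in_batches for the fan-out and ActiveRecord's with_lock inside the per-invoice job. The "already charged" guard is an added assumption, included because background jobs can be retried.

```ruby
# Plain-Ruby sketch; every name here is hypothetical.
# Invoice stands in for an ActiveRecord model; its Mutex plays the role of a
# database row lock (ActiveRecord's invoice.with_lock).
Invoice = Struct.new(:id, :paid, :late_fee_cents, :lock) do
  def with_lock(&block)
    lock.synchronize(&block)
  end
end

LATE_FEE_CENTS = 500 # assumed flat fee

# Fan-out job: walk the unpaid invoices in batches and enqueue one id per invoice.
# In Rails this would be find_in_batches plus one scheduled job per invoice.
def enqueue_overdue(invoices, queue, batch_size: 100)
  invoices.reject(&:paid).each_slice(batch_size) do |batch|
    batch.each { |invoice| queue << invoice.id }
  end
  queue
end

# Per-invoice job: recheck the current state under the lock before charging.
def add_late_fee(invoice)
  invoice.with_lock do
    return :skipped_paid if invoice.paid
    # Assumption: jobs may be retried, so never charge the same invoice twice.
    return :skipped_already_charged if invoice.late_fee_cents > 0

    invoice.late_fee_cents += LATE_FEE_CENTS
    :charged
  end
end

invoices = (1..5).map { |id| Invoice.new(id, false, 0, Mutex.new) }
by_id = invoices.to_h { |invoice| [invoice.id, invoice] }
queue = enqueue_overdue(invoices, [])

# A customer pays invoice 3 after it was enqueued but before its job runs.
by_id[3].with_lock { by_id[3].paid = true }

results = queue.map { |id| add_late_fee(by_id[id]) }
p results # prints [:charged, :charged, :skipped_paid, :charged, :charged]
```

The point of the sketch is that the list of overdue invoices is only a hint: the decision to charge is made inside the lock, against the invoice's current state, so a payment that lands between enqueueing and execution causes the job to skip the invoice rather than charge it.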