Tags: ruby, distributed, distributed-transactions, daemons

How can I distribute a task among many processes in Ruby?


I have a Ruby daemon that selects 100 records from a database and performs a task on each of them.

To make it faster, I usually run 3 instances of the same daemon, and each one selects different data by using MySQL LIMIT and OFFSET.

The problem is that sometimes a task is performed 2 or 3 times on the same record.

So I think that relying only on the database LIMIT and OFFSET is not enough, since 2 or more daemons can sometimes collect the same data at the same time.

How can I do this safely, preventing 2 instances from selecting the same data?

  • Daemon 1 => selects records from 1 to 100
  • Daemon 2 => selects records from 101 to 200
  • Daemon 3 => selects records from 201 to 300

Solution

  • Rather than rolling your own solution, you might want to look at existing tools for processing background jobs, such as Resque (a personal favorite). With Resque, you would queue a job for each of your rows using a trigger that makes sense in your application (it's hard to say which without any context), for example a link on your website. You would keep a fixed number of workers running at all times (three in your case), and Resque does the queue management for you. Resque uses Redis as its backend, so pushing to and popping from the queue is atomic out of the box, which means each job is handed to only one worker (no more double-processing).

    Resque also comes with an intuitive, easy-to-use web interface for monitoring your jobs and workers.
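As a rough illustration of the approach described above, here is a minimal sketch of what a Resque job might look like. The class name `ProcessRecord`, the queue name `:records`, and the body of `perform` are assumptions made up for this example; the question does not say what the task actually does. A Resque job is a plain Ruby class with a `@queue` instance variable and a `self.perform` class method, so the class itself can be exercised without Resque or Redis installed.

```ruby
# Hypothetical job class (name and queue are illustrative, not from the question).
# Resque expects a @queue instance variable and a self.perform class method.
class ProcessRecord
  @queue = :records

  # Called by a worker with the arguments that were enqueued.
  # The body is a placeholder: load the record by id and do the real work here.
  def self.perform(record_id)
    "processed record #{record_id}"
  end
end

# Enqueueing one job per row (needs the resque gem and a running Redis server):
#
#   require 'resque'
#   Resque.enqueue(ProcessRecord, record_id)
#
# Starting a worker (run this three times for three workers), assuming the
# Rakefile contains: require 'resque/tasks'
#
#   QUEUE=records rake resque:work
```

Passing the record id rather than the record itself is a common choice here, since job arguments are serialized as JSON and each worker can then load fresh data when it picks the job up.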