ruby-on-rails, heroku, mongoid, delayed-job

Delayed Job forgets about jobs that have been sitting on the queue for several minutes and have no attempts


I'm using delayed_job to create a large number of jobs, nearly all at once, to be run at a later time. If the number of jobs gets too high, then after a certain amount of time every job is cleared from the queue, regardless of its state.

The following Rails project illustrates this issue: https://github.com/hayksaakian/taskbreaker

To recreate the issue, create several tasks (say 5 to 15), each with around 100 goals (from the web interface or the console).

Then, in the console, attempt these tasks with: Task.attempt_tasks

What happens is the following: many jobs are created, the workers do their thing for several minutes, then poof, every job disappears from the queue.

To verify this, check any task. You'll notice that an accomplishment may not have an arbitrary_number equal to 10 (it should be 10, since we increment it by one for each of the delayed attempt_string calls). The arbitrary_array of each accomplishment is also not of length 10 (which it should be, given that we delayed 10 calls to attempt_array).

I'm not sure why this is happening, as I'm seeing no errors, but I'm sure that it is happening.

See an example of the bad work at taskbreaker.herokuapp.com

Note: I'm hosting on Heroku, if that's of any help. Also, you'll need at least 5 workers to recreate the issue in any reasonable amount of time.


Solution

  • This was due to a race condition. Because the delayed methods ran concurrently on multiple workers, their reads and writes to the same records overlapped, resulting in unexpected output.
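
To make the race concrete, here is a minimal sketch of a lost update and one way to avoid it. It does not use the taskbreaker code: FakeStore is a hypothetical in-memory stand-in for the database, and the interleaving in the first half is written out by hand so the result is deterministic. In a real Mongoid app, the likely counterpart of atomic_inc would be Mongoid's atomic operators such as inc and push (backed by MongoDB's $inc and $push), though whether that matches the actual fix applied in the project is an assumption.

```ruby
# Hypothetical in-memory store; stands in for the database in this sketch.
# It is not part of delayed_job or Mongoid.
class FakeStore
  def initialize
    @data = Hash.new(0)
    @lock = Mutex.new
  end

  def read(key)
    @data[key]
  end

  def write(key, value)
    @data[key] = value
  end

  # Atomic read-modify-write: the whole update runs under one lock,
  # so no other thread can slip in between the read and the write.
  def atomic_inc(key, by = 1)
    @lock.synchronize { @data[key] += by }
  end
end

# 1. Lost update: two workers each read, then each save "what I read + 1".
store = FakeStore.new
seen_by_a = store.read(:arbitrary_number)       # worker A reads 0
seen_by_b = store.read(:arbitrary_number)       # worker B reads 0 before A saves
store.write(:arbitrary_number, seen_by_a + 1)   # A saves 1
store.write(:arbitrary_number, seen_by_b + 1)   # B also saves 1, overwriting A
lost_update_result = store.read(:arbitrary_number)

# 2. The same kind of work done with atomic increments from 10 threads.
store = FakeStore.new
threads = 10.times.map { Thread.new { store.atomic_inc(:arbitrary_number) } }
threads.each(&:join)
atomic_result = store.read(:arbitrary_number)

puts "read-then-save, 2 increments: #{lost_update_result}"  # prints 1, not 2
puts "atomic, 10 increments: #{atomic_result}"              # prints 10
```

The first half mirrors the symptom in the question: every job ran and none raised an error, yet the final count is lower than the number of jobs, because one write silently replaced another.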