Tags: ruby-on-rails, postgresql, ruby-on-rails-4, synchronized, sidekiq

Block/Re-queue other sidekiq jobs from processing when existing sidekiq job is processing a particular resource


I have Sidekiq jobs that process many types of resources. However, for one particular type of resource, e.g. Resource X, I need to ensure that only 1 Sidekiq job can process that resource at any given time.

For example, if I have 3 Sidekiq jobs that get queued simultaneously and all want to interact with resource X, then only 1 of them may process resource X. The 2 remaining jobs have to wait (or be re-queued) until the job that is currently processing the resource finishes.

Currently, I add a record to a database table when a Sidekiq job starts processing the resource, and use that record to stop other Sidekiq jobs from processing the same resource. The record is deleted by the job that added it once it finishes processing resource X. It also expires after a set time: if the record was created more than 5 minutes ago, it no longer grants exclusive access to resource X, and the next Sidekiq job that wants to process resource X may alter the record and claim exclusive access for itself.

Pseudocode of my current implementation:

def perform(res_id, res_type)
  # Only applies to RESOURCE_X
  if res_type == RESOURCE_X
    claim = ResourceProcessor.where(res_id: res_id).first

    # Check-then-act: nothing here stops two jobs from both passing this
    # check before either of them has written its own record.
    if claim.nil? || (Time.now - claim.created_at) > 5.minutes
      ResourceProcessor.create(res_id: res_id)
      process_resource_x(res_id)
    else
      SidekiqWorker.perform_in(5.minutes, res_id, res_type) # Try again later
      return
    end

    # Let other Sidekiq jobs know they can now compete to process resource X
    ResourceProcessor.where(res_id: res_id).destroy_all
  else
    process_other_resource(res_id)
  end
end

Unfortunately, my solution does not work. It works fine when there is a delay between the Sidekiq jobs that want to process resource X. However, if those jobs arrive simultaneously, my solution falls apart.

Is there any way I can enforce some sort of synchronization only when processing resource X?

Btw, my Sidekiq jobs may be distributed across several machines (but they all access the same Redis server on a dedicated machine).


Solution

  • I did more research based on the comment provided by Thomas.

    The link he provided was extremely useful. They implemented their own custom Lock class to achieve the results they wanted. However, I did not use their custom lock code because I needed a different behaviour.

    The specific behaviour I was looking to implement is "re-queue if locked", not "wait if locked".

    There are alternative tools that I could have used, such as redis-semaphore and with_advisory_lock. I tested redis-semaphore and found it buggy: it wasn't returning the lock state and resource count correctly. Also, according to the issues on GitHub, redis-semaphore can deadlock itself in some situations, so I decided to abandon it. I also decided against with_advisory_lock because it had fewer GitHub stars than redis-semaphore.

    In the end I found a way to implement the locking pattern I described in my question, which is to block Sidekiq jobs based on a value in my database. I dealt with the concurrency issue of multiple Sidekiq jobs reading stale values by locking the entire database row with Rails' built-in pessimistic locking (ActiveRecord::Locking::Pessimistic). This ensures that only 1 Sidekiq worker can access the database row that holds the lock value at any given time. The locking period is kept to a minimum because only a read and, when applicable, a write are performed while the row is locked. Subsequent operations, such as re-queueing and cleaning up, are done afterwards.
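
The "re-queue if locked" flow described above could look roughly like the sketch below. It is not the code I actually ran: to keep it runnable without Rails or a database, a hypothetical in-memory ResourceRow stands in for the database row, and a Mutex stands in for the row lock that ActiveRecord's with_lock would take with SELECT ... FOR UPDATE. The names ResourceRow, locked_at, try_claim, release and LOCK_TTL are all assumptions made for illustration. It also assumes the row already exists for each resource, since a row lock cannot protect a row that is not there yet.

```ruby
# Hypothetical stand-in for the database row that holds the lock value.
# In Rails the block given to with_lock would run in a transaction with the
# row locked; here a Mutex plays that role.
class ResourceRow
  attr_reader :locked_at

  def initialize
    @locked_at = nil
    @row_lock = Mutex.new
  end

  # Run the block while holding exclusive access to the row.
  def with_lock
    @row_lock.synchronize { yield }
  end

  def update_locked_at(value)
    @locked_at = value
  end
end

LOCK_TTL = 5 * 60 # seconds; mirrors the 5-minute expiry from the question

# Returns true if the caller now owns the resource, false otherwise.
# Only one read and (when applicable) one write happen under the row lock.
def try_claim(row, now: Time.now)
  row.with_lock do
    free = row.locked_at.nil? || (now - row.locked_at) > LOCK_TTL
    row.update_locked_at(now) if free
    free
  end
end

def release(row)
  row.with_lock { row.update_locked_at(nil) }
end

# Shape of the worker: claim first, then process or re-queue outside the lock.
def perform(row, requeue:)
  unless try_claim(row)
    requeue.call # in Sidekiq this might be SidekiqWorker.perform_in(5.minutes, ...)
    return :requeued
  end

  begin
    yield # the actual processing of resource X
    :processed
  ensure
    release(row) # clean up even if processing raises
  end
end
```

Because the check and the write happen inside the same locked section, jobs that arrive at the same moment cannot all see the resource as free. The job that loses the claim returns straight away and is re-queued instead of blocking a Sidekiq thread while it waits.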