ruby-on-rails, scope, multi-tenant, delayed-job

delayed_job: One job per tenant at a time?


I have a multi-tenant Rails app with multiple delayed_job workers.

To avoid overlapping tenant-specific work, I would like to coordinate the workers so that at most one job per tenant is being worked on at any time.

I thought about using the (named) queue column and adding "tenant_1", "tenant_2" and so on. Unfortunately the queues have to be named during configuration, so this approach is not flexible enough for many tenants.

Is there a way to customize the way delayed_job picks the next task? Is there another way to define a scope?


Solution

  • Your best bet is probably to roll a custom solution that implements a distributed lock. The workers all run normally and pull from the usual queues, but before performing work each one checks with another system (Redis, RDBMS, API, whatever) to verify that no other worker is already performing a job for that tenant. If the tenant is not being worked, set the lock for that tenant and work the job. If the tenant is locked, don't perform the work. A lot of the implementation details are your call: whether to move on and try another job, re-enqueue the job at the back of the queue, treat it as a failure that counts against your retry limits, or do something else entirely. This is pretty open-ended, so I'll leave the details to you, but here are some tips:

    • Inheritance will be your friend; define this behavior on a base job and inherit from it on the jobs you expect your workers to run. This also allows you to customize the behavior if you have "special" cases for certain jobs that come up without breaking everything else.
    • Assuming you're not running through ActiveJob (since it wasn't mentioned), read up on delayed_job hooks: https://github.com/collectiveidea/delayed_job/#hooks - they may be an appropriate and/or useful tool
    • Get familiar with some of the differences and tradeoffs in Pessimistic and Optimistic locking strategies - this answer is a good starting point: Optimistic vs. Pessimistic locking
    • Read up on general practices surrounding distributed locks so you can choose the best tools and strategies for yourself. It doesn't have to be a crazy complicated solution; a simple table in the database that stores the tenant identifier is sufficient. You will want to consider the failure cases, though: how do you manage locks that are abandoned, for example?
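
To make the tips above concrete, here is a minimal sketch of a base job that takes a per-tenant lock before doing its work. Everything named here (`TenantLockStore`, `TenantJob`, `TenantBusy`, `perform_for_tenant`, the 300-second TTL) is hypothetical and not part of delayed_job. The lock store is an in-memory stand-in that only works inside a single process; with several worker processes you would have to back it with something they all share, such as Redis or a database table. The only delayed_job behavior assumed is that it calls `perform` on a custom job object and treats a raised exception as a failed attempt to be retried later.

```ruby
# Hypothetical in-memory stand-in for a shared lock store.
# Real workers run in separate processes, so a real store must live in
# Redis, the database, or another system every worker can reach.
class TenantLockStore
  def initialize(ttl: 300, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @ttl = ttl          # seconds until an abandoned lock may be reclaimed
    @clock = clock      # injectable so expiry can be exercised without sleeping
    @locks = {}         # tenant_id => [owner, expires_at]
    @mutex = Mutex.new
  end

  # Returns true if the lock was taken, false if a live lock already exists.
  def acquire(tenant_id, owner)
    @mutex.synchronize do
      now = @clock.call
      current = @locks[tenant_id]
      return false if current && current[1] > now
      @locks[tenant_id] = [owner, now + @ttl]
      true
    end
  end

  # Releases the lock only if `owner` is the one holding it.
  def release(tenant_id, owner)
    @mutex.synchronize do
      current = @locks[tenant_id]
      return false unless current && current[0] == owner
      @locks.delete(tenant_id)
      true
    end
  end
end

# Raised when another job for the same tenant is already running.
class TenantBusy < StandardError; end

# Base job: subclasses implement #perform_for_tenant instead of #perform.
class TenantJob
  class << self
    attr_accessor :lock_store   # kept off the instance so it is not serialized with the job
  end

  attr_reader :tenant_id

  def initialize(tenant_id)
    @tenant_id = tenant_id
  end

  def perform
    store = TenantJob.lock_store
    owner = "#{Process.pid}-#{object_id}"
    # Raising hands the decision to the job runner: the attempt counts as failed.
    raise TenantBusy, "tenant #{tenant_id} is busy" unless store.acquire(tenant_id, owner)

    begin
      perform_for_tenant
    ensure
      store.release(tenant_id, owner)   # always release, even if the work raised
    end
  end

  def perform_for_tenant
    raise NotImplementedError, "#{self.class} must implement perform_for_tenant"
  end
end
```

Raising on a busy tenant is just one of the options listed above (treating it as a failure bound to the retry limits); re-enqueueing at the back of the queue would be an equally valid choice. The TTL is what handles abandoned locks in this sketch: if a worker dies mid-job, the lock becomes reclaimable once it expires, so the TTL has to be longer than the slowest job you expect. With Redis, the acquire step is commonly done with `SET` plus the `NX` and `EX` options, which gives the same "take it only if free, with an expiry" behavior in one atomic command.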

  • Seriously consider not doing this; is it really strictly required for the system to operate properly? If so, it's probably indicative of an underlying flaw in your data model or in how you've structured transformations around that data. Strive for ACIDity in your application when thinking about operations on the data and you can avoid a lot of these problems. There's a reason it's not a commonly available "out of the box" feature on background job runners. If there is an underlying flaw, it won't just bite you on this problem but on something else - guaranteed!