I have around 10 workers that perform a job that includes the following:
user = User.find_or_initialize_by(email: '[email protected]')
if user.new_record?
  # ... some code here that does something, taking around 5 seconds or so
elsif user.persisted?
  # ... some code here that does something, taking around 5 seconds or so
end
user.save
The problem is that at certain times, two or more workers run this code at exactly the same time, and I later found out that two or more Users ended up with the same email, when I should always end up with only unique emails.
It is not possible in my situation to create a DB unique index on email, as unique emails are conditional: some Users should have a unique email, and some do not.
It is worth mentioning that my User model has uniqueness validations, but they still don't help me because, between .find_or_initialize_by and .save, there is code that depends on whether the user object has already been created or not.
I tried pessimistic and optimistic locking, but neither helped me (or maybe I just didn't implement them properly), so any suggestions on this would be welcome.
The only solution I can think of is to block the other threads (Sidekiq jobs) whenever these lines of code get executed, but I am not sure how to implement this, nor do I know whether it is even an advisable approach.
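To illustrate what I mean by locking, something like a transaction-scoped advisory lock is the kind of thing I had in mind (this is only a sketch I have not tried; it assumes Postgres, and deriving the lock key from the email via CRC32 is made up for the example):

# Sketch only, assuming Postgres. The lock key is derived from the email so
# that only workers touching the same email serialize against each other.
require 'zlib' # already loaded under Rails

User.transaction do
  key = Zlib.crc32('[email protected]')
  # pg_advisory_xact_lock blocks until the lock is free and releases it
  # automatically when the transaction ends.
  ActiveRecord::Base.connection.execute("SELECT pg_advisory_xact_lock(#{key})")

  user = User.find_or_initialize_by(email: '[email protected]')
  if user.new_record?
    # ... slow code for the new-record case
  elsif user.persisted?
    # ... slow code for the existing-record case
  end
  user.save
end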
I would appreciate any help.
EDIT
In my specific case, it is going to be hard to put the email parameter in the job, as this job is more complex than what was described above. The job is actually an export script, and the code above is just one section of it. I also don't think it's possible to separate the functionality above into a separate worker, as the whole job flow should be serial and no parts should be processed in parallel / asynchronously. This job is just one of the jobs managed by another job, which is ultimately managed by the master job.
I managed to solve my problem with the following:

I found out that I can actually add a where clause to a Rails DB unique index (a partial index), so I can now set up uniqueness conditions for different types of Users at the database level, and other concurrent jobs will now raise an ActiveRecord::RecordNotUnique error if the record has already been created.
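The migration looks roughly like this. Note that `requires_unique_email` is just a placeholder for whatever column or condition distinguishes my types of Users, the index name is made up, and the migration version should match your Rails version; partial indexes also require a database that supports them, such as Postgres or SQLite:

class AddPartialUniqueIndexToUsersEmail < ActiveRecord::Migration[7.0]
  def change
    # Unique only for rows matching the where condition;
    # all other Users can share emails freely.
    add_index :users, :email,
              unique: true,
              where: "requires_unique_email = TRUE",
              name: "index_users_on_email_conditionally_unique"
  end
end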
The only remaining problem then is the code in between .find_or_initialize_by and .save, since that code depends on whether the User object is new: among concurrent jobs, only one should ever get .new_record? == true, and the others should take the .persisted? == true path, as one job will always be first to create the record. None of this works out of the box, though, because the DB uniqueness index is only checked at the .save line. I therefore solved the problem by moving .save before those conditions (capturing .new_record? just before saving), and adding a rescue block around .save that re-enqueues the job whenever it raises an ActiveRecord::RecordNotUnique error, to make sure that async jobs won't get conflicts. The code now looks like below.
user = User.find_or_initialize_by(email: '[email protected]')

# Capture this BEFORE saving: after a successful save, new_record? is always false.
is_new_record = user.new_record?

begin
  user.save
rescue ActiveRecord::RecordNotUnique
  # Another concurrent job won the race; re-enqueue and stop here.
  MyJob.perform_later(params_hash)
  return
end

if is_new_record
  # do something if not yet created
else
  # do something if already created
end