I have a Sidekiq project running a task creates multiple child records and updates the parent record to which this child object belongs. The database is Postgres.
The schema is give below. When the Child record is created, there is a before_create method that updates a flag in the Parent. The Parent has a before_save method to update a timestamp.
# t.boolean "is_updated", default: false
# id :integer not null, primary key
#t.datetime "updated_at"
class Parent < ActiveRecord::Base
has_many :child
def update_flag
self.is_updated = true
end
before_save : set_updated_at
def set_updated_at
self.updated_at = Time.current
end
end
class ChildRecord < ActiveRecord::Base
belongs_to :parent
before_create :update_parent_flag
def update_parent_flag
if self.parent.try(:update_flag)
self.parent.save!
end
end
end
When the Sidekiq job creates just one Child record, there is no error. However when the job tries to create a larger bactch of Child records (35 in one example), the Job is in busy state for a long time. We can see that the Postgres connection is waiting for a lock in the Parent update.
Update The Sidekiq job updates or creates the batch of Child records in a single transaction.
def perform(params, options = {})
children = params[:children]
#New transaction
ActiveRecord::Base.transaction do
children.each do |child|
return_code, error = create_or_update_child_record(child)
if return_code != :created
Rails.logger.info error
return
end
end
end
The following is the blocking statement in the database.
UPDATE "Parent" SET "updated_at" = $1 WHERE "Parent"."id" = $2
How can I avoid this lock when creating multiple Child records? Is there a better design?
Inserting each Child one-by-one is going to be inefficient regardless. You need a way to bulk insert Child objects. This bulk insert method would update their Parents only once at the end.
I prefer activerecord-import to Rails' insert_all primarily because it can do model validations.
# Make the Child models, but do not insert them.
children = [Child.new(...), Child.new(...), ...]
# Validate and insert all the children in bulk
Child.import! children, validate: true
# Get their Parent's ids
parent_ids = children.map(&:parent_id).uniq
# Update their Parents
Parent
.where(id: parent_ids)
.update_all(
is_updated: true,
# update_all does not update updated_at
updated_at: Time. current
)
This executes one insert for the Childs and one update for their Parents.
Note: Consider if the is_updated flag is necessary. Can you rely on the updated_at timestamp alone?