[ Cross posting issue I posted on github to widen the audience ]
We are seeing a strange behavior when using the delay
method inside a transaction. It can easily be reproduced in a rails console with a User
model and the following code.
Does anyone know if delayed_job does something internally that changes the order of database operations as seen below?
# User model has first_name and last_name attributes
# users table has at least one test record
# The code below is for use in a rails console. Just copy, paste and run.
# a simple operation to queue up as a delayed job
def change_user_name(user_id)
u = User.find(user_id)
u.first_name = "Delaid"
u.save
end
User.transaction do
u2 = User.first
u2.last_name = "Willie"
u2.save
delay.change_user_name(u2.id)
u2.last_name = "Wonka"
u2.save
end
When this is run, we expect the insert into the delayed_jobs
table to happen in the order of operations shown, and within the transaction we have specified. This is confirmed by what we see in the rails console log, as shown below.
2015-02-03 09:17:55 -0600 (9606) : DEBUG : (0.3ms) BEGIN
2015-02-03 09:17:55 -0600 (9606) : DEBUG : User Load (0.7ms) SELECT `users`.* FROM `users` WHERE (user_type_id not in ('6')) LIMIT 1
2015-02-03 09:17:55 -0600 (9606) : DEBUG : SQL (0.4ms) INSERT INTO `history_logs` (`action`, `created_at`, `data`, `organization_id`, `updated_at`, `user_id`) VALUES ('update via on 50', '2015-02-03 15:17:55', '{\"last_name\":[\"van Tilburg\",\"Willie\"]}', NULL, '2015-02-03 15:17:55', 0)
2015-02-03 09:17:55 -0600 (9606) : DEBUG : (0.4ms) UPDATE `users` SET `last_name` = 'Willie', `updated_at` = '2015-02-03 15:17:55' WHERE `users`.`id` = 50
2015-02-03 09:17:55 -0600 (9606) : DEBUG : (0.2ms) BEGIN
2015-02-03 09:17:55 -0600 (9606) : DEBUG : SQL (0.6ms) INSERT INTO `delayed_jobs` (`attempts`, `created_at`, `failed_at`, `handler`, `last_error`, `locked_at`, `locked_by`, `priority`, `queue`, `run_at`, `updated_at`) VALUES (0, '2015-02-03 15:17:55', NULL, '--- !ruby/object:Delayed::PerformableMethod\nobject: !ruby/object\n prompt:\n :PROMPT_I: ! \'1.9.3p125 :%03n > \'\n :PROMPT_S: ! \'1.9.3p125 :%03n%l> \'\n :PROMPT_C: ! \'1.9.3p125 :%03n > \'\n :PROMPT_N: ! \'1.9.3p125 :%03n?> \'\n :RETURN: ! \" => %s \\n\"\n :AUTO_INDENT: true\nmethod_name: :change_user_name\nargs:\n- 50\n', NULL, NULL, NULL, 0, NULL, '2015-02-03 15:17:55', '2015-02-03 15:17:55')
2015-02-03 09:17:55 -0600 (9606) : DEBUG : (9.4ms) COMMIT
2015-02-03 09:17:55 -0600 (9606) : DEBUG : SQL (0.4ms) INSERT INTO `history_logs` (`action`, `created_at`, `data`, `organization_id`, `updated_at`, `user_id`) VALUES ('update via on 50', '2015-02-03 15:17:55', '{\"last_name\":[\"Willie\",\"Wonka\"]}', NULL, '2015-02-03 15:17:55', 0)
2015-02-03 09:17:55 -0600 (9606) : DEBUG : (0.4ms) UPDATE `users` SET `last_name` = 'Wonka', `updated_at` = '2015-02-03 15:17:55' WHERE `users`.`id` = 50
2015-02-03 09:17:55 -0600 (9606) : DEBUG : (3.5ms) COMMIT
=> true
However, mysql
binlogs show something different, like below. Note how the INSERT
into the delayed_jobs
table happens before the other operations and how it is placed inside its own transaction.
BEGIN
/*!*/;
# at 2278762
#150203 9:17:55 server id 1 end_log_pos 2278790 Intvar
SET INSERT_ID=24398/*!*/;
# at 2278790
#150203 9:17:55 server id 1 end_log_pos 2279486 Query thread_id=290135 exec_time=0 error_code=0
SET TIMESTAMP=1422976675/*!*/;
INSERT INTO `delayed_jobs` (`attempts`, `created_at`, `failed_at`, `handler`, `last_error`, `locked_at`, `locked_by`, `priority`, `queue`, `run_at`, `updated_at`) VALUES (0, '2015-02-03 15:17:55', NULL, '--- !ruby/object:Delayed::PerformableMethod\nobject: !ruby/object\n prompt:\n :PROMPT_I: ! \'1.9.3p125 :%03n > \'\n :PROMPT_S: ! \'1.9.3p125 :%03n%l> \'\n :PROMPT_C: ! \'1.9.3p125 :%03n > \'\n :PROMPT_N: ! \'1.9.3p125 :%03n?> \'\n :RETURN: ! \" => %s \\n\"\n :AUTO_INDENT: true\nmethod_name: :change_user_name\nargs:\n- 50\n', NULL, NULL, NULL, 0, NULL, '2015-02-03 15:17:55', '2015-02-03 15:17:55')
/*!*/;
# at 2279486
#150203 9:17:55 server id 1 end_log_pos 2279513 Xid = 129277545
COMMIT/*!*/;
BEGIN
/*!*/;
# at 2279592
#150203 9:17:55 server id 1 end_log_pos 2279620 Intvar
SET INSERT_ID=626645/*!*/;
# at 2279620
#150203 9:17:55 server id 1 end_log_pos 2279929 Query thread_id=290137 exec_time=0 error_code=0
SET TIMESTAMP=1422976675/*!*/;
INSERT INTO `history_logs` (`action`, `created_at`, `data`, `organization_id`, `updated_at`, `user_id`) VALUES ('update via on 50', '2015-02-03 15:17:55', '{\"last_name\":[\"van Tilburg\",\"Willie\"]}', NULL, '2015-02-03 15:17:55', 0)
/*!*/;
# at 2279929
#150203 9:17:55 server id 1 end_log_pos 2280106 Query thread_id=290137 exec_time=0 error_code=0
SET TIMESTAMP=1422976675/*!*/;
UPDATE `users` SET `last_name` = 'Willie', `updated_at` = '2015-02-03 15:17:55' WHERE `users`.`id` = 50
/*!*/;
# at 2280106
#150203 9:17:55 server id 1 end_log_pos 2280134 Intvar
SET INSERT_ID=626646/*!*/;
# at 2280134
#150203 9:17:55 server id 1 end_log_pos 2280437 Query thread_id=290137 exec_time=0 error_code=0
SET TIMESTAMP=1422976675/*!*/;
INSERT INTO `history_logs` (`action`, `created_at`, `data`, `organization_id`, `updated_at`, `user_id`) VALUES ('update via on 50', '2015-02-03 15:17:55', '{\"last_name\":[\"Willie\",\"Wonka\"]}', NULL, '2015-02-03 15:17:55', 0)
/*!*/;
# at 2280437
#150203 9:17:55 server id 1 end_log_pos 2280613 Query thread_id=290137 exec_time=0 error_code=0
SET TIMESTAMP=1422976675/*!*/;
UPDATE `users` SET `last_name` = 'Wonka', `updated_at` = '2015-02-03 15:17:55' WHERE `users`.`id` = 50
/*!*/;
# at 2280613
#150203 9:17:55 server id 1 end_log_pos 2280640 Xid = 129277538
COMMIT/*!*/;
We also tested the scenario without the outer transaction (like below) and the same behavior was seen. Meaning, the INSERT
into delayed_jobs
happen before the other code and out of order.
# This code without an outer transaction results in the same buggy behavior
u2 = User.first
u2.last_name = "Willie"
u2.save
delay.change_user_name(u2.id)
u2.last_name = "Wonka"
u2.save
Further, it does not matter if we use save
or save!
in the various AR operations in the above code. The issue happens regardless.
MySQL version
Variable_name Value
innodb_version 5.5.31
protocol_version 10
slave_type_conversions
version 5.5.31-0ubuntu0.12.04.1-log
version_comment (Ubuntu)
version_compile_machine x86_64
version_compile_os debian-linux-gnu
Rails and gems
rails-3.2.17
delayed_job-4.0.4
delayed_job_active_record-4.0.2
After a lot of printf debugging inside delayed_job and active_record code, I figured out the problem. We have been using a gem called db_charmer
to implement connection switching between primary (rw) and replica (ro) instance of our database and this gem under the covers implements connection switching logic. This was interfering with the transaction context logic in ActiveRecord.
The following monkey-patch fixed the issue.
I figured out the issue. We are using the [now defunct] db_charmer gem to do connection switching between primary and secondary database, and that was not taking effect on the delayed job model class. This causes us to use different connection instances for delayed_job vs other calls, and the transactions are treated separately.
The following monkey-patch fixed the issue.
class Delayed::Backend::ActiveRecord::Job
# use the same as what you use for all your other AR model classes
db_magic :connection => :primary, :slave => :replica, :force_slave_reads => false
end
The db_charmer
gem is no longer supported and I need to figure out an alternative :-( In the meantime, this patch should work.