I have a table with an column user_id
, which is an integer index. There's a number of processes that have to pick non-conflicting users and do some processing. I am using modulo for this, so at some point, they will do (e.g. in the case of 4 threads):
UPDATE table SET data = x WHERE user_id % 4 = y;
Where y is 0 to 3, to mark which specific thread is working. MySQL can't use the user_id index like that unfortunately, so it seems to lock all the table to scan it. This causes deadlocks if it only has a READ lock, while another process is doing something like this:
DELETE FROM table WHERE user_id = z;
What do you think the best way to deal with this would be?
I am thinking I could have a separate SELECT user_id FROM table WHERE user_id % 4 = y
query, get the results (it will be a few hundred) and then do the update with a user_id IN (...)
instead.
I am not sure how locks work for nested selects, so could perhaps:
UPDATE table SET data = x WHERE user_id IN (SELECT user_id from table WHERE user_id % 4 = y)
also work, to avoid fetching the SELECT results to do the UPDATE?
Otherwise I guess I could use LOCK TABLES etc, but I'd think that's not nice, so I am curious whether there's something else to consider.
Create a temporary table containing the IDs you want. Then you can join with this and it should only lock those rows.
CREATE TEMPORARY TABLE temp_ids AS
SELECT id
FROM table
WHERE id % 4 = @y;
UPDATE table AS t
JOIN temp_ids as i ON t.id = i.id
SET t.data = @x;
I don't think it will work if you join with a subquery or CTE, because I suspect that will keep the table locked for the duration of the combined query.