Search code examples
sqlpostgresqlconcurrencytransactionsdatabase-deadlocks

Deadlock when using SELECT FOR UPDATE with WHERE IN clause


Consider the following example:

-- Transaction 1 -> T1
BEGIN;
SELECT * FROM table1 WHERE id = 1 FOR UPDATE;
SELECT * FROM table1 WHERE id = 2 FOR UPDATE;
UPDATE table1 set col1 = 'abcd' where id = 1;
COMMIT;

-- Transaction 2 -> T2
BEGIN;
SELECT * FROM table1 WHERE id = 2 FOR UPDATE;
SELECT * FROM table1 WHERE id = 1 FOR UPDATE;
UPDATE table1 set col1 = 'defg' where id = 2;
COMMIT;

In this example it's obvious that a deadlock could happen if both transaction are executed concurrently because if T1 locks row with id=1 and then T2 locks row with id=2, both T1 and T2 can not perform the second SELECT FOR UPDATE query and we have a deadlock.

Now, to solve this, we could just perform the SELECT FOR UPDATE queries in the same order:

-- Transaction 1 -> T1
BEGIN;
SELECT * FROM table1 WHERE id = 1 FOR UPDATE;
SELECT * FROM table1 WHERE id = 2 FOR UPDATE;
UPDATE table1 set col1 = 'abcd' where id = 1;
COMMIT;

-- Transaction 2 -> T2
BEGIN;
SELECT * FROM table1 WHERE id = 1 FOR UPDATE;
SELECT * FROM table1 WHERE id = 2 FOR UPDATE;
UPDATE table1 set col1 = 'defg' where id = 2;
COMMIT;

We solved the deadlock problem for this example.

Now, my question is, if you consider the following similar example:

-- Transaction 1 -> T1
BEGIN;
SELECT * FROM table1 WHERE id IN (1, 2) FOR UPDATE;
UPDATE table1 set col1 = 'abcd' where id = 1;
COMMIT;

-- Transaction 2 -> T2
BEGIN;
SELECT * FROM table1 WHERE id IN (1, 2) FOR UPDATE;
UPDATE table1 set col1 = 'defg' where id = 2;
COMMIT;

My questions:

Is it possible to have a deadlock in the last example?
In other terms: will Postgres lock all rows that match a WHERE condition atomically at the same time?
If yes, can we also say that the WHERE clause order does not count? So that in T1 we could use:

SELECT * FROM table1 WHERE id IN (1, 2) FOR UPDATE;

While in T2 we could use:

SELECT * FROM table1 WHERE id IN (2, 1) FOR UPDATE;

Without risking to cause a deadlock?


Solution

  • The last example is susceptible to deadlocks.

    Locks are not taken "atomically" in a sense that they would happen virtually at the same time for the same transaction (or even the same statement). Locks are taken along the way one after the other, and released at the end of the transaction.

    The point is that the list of items in the IN clause does not necessarily mandate an order in which rows are locked. You need an ORDER BY clause to do that. Or separate statements like you already successfully tried.

    Separate statements are verbose and more expensive. So:

    BEGIN;
    SELECT FROM table1
    WHERE  id IN (1,2)
    ORDER  BY id             -- !
    FOR    UPDATE;
    
    UPDATE table1 set col1 = 'abcd' WHERE id = 1;
    COMMIT;
    

    As long as all writing access to the same table sticks to the same order reliably, there cannot be a deadlock (from this interaction).

    The manual:

    The best defense against deadlocks is generally to avoid them by being certain that all applications using a database acquire locks on multiple objects in a consistent order.

    Related:

    With data more or less physically sorted by id on disk, this can even improve performance.