Search code examples
postgresqlconcurrencytransactionsselect-for-update

PostgreSQL SELECT ... FOR UPDATE: What happens with concurrent long running queries?


I was wondering what happens when two transactions execute a SELECT ... FOR UPDATE query in parallel. The background is that I want to implement a job queue using SELECT ... FOR UPDATE with SKIP LOCKED, as shown here: https://vladmihalcea.com/database-job-queue-skip-locked/. But in this article the queries are quite trivial.

An example with two transactions T1 and T2 (transaction isolation level is set to READ_COMMITTED):

  1. T1 starts
  2. T1 executes SELECT ... FOR UPDATE, searching for NEW rows, which requires some time.
  3. T2 starts
  4. T2 executes SELECT ... FOR UPDATE with same WHERE clause and parameters as T1, which also takes some time.
  5. T1 finally finds all rows, locks them
  6. T1 starts updating the rows (e.g. by marking them as now being IN_PROGRESS)
  7. T2 finally finds rows => what happens now?

Some questions:

  1. I would assume that T1 locks the rows in an atomic operation. Is this correct?
  2. So when T2 finally finds it's result set and tries to lock the rows it cannot do it? How does T2 react in this case? My assumption would be that it waits until T1 releases the locks (when not using NOWAIT).
  3. What if T2 finishes the query before T1 makes any changes (e.g. changing a job status from NEW to IN_PROGRESS)? Can both transaction find the same result set?
  4. If T1 somehow marks the locked rows (e.g. by changing a status column from NEW to IN_PROGRESS) and T2 looks for the original status (NEW), would T2 then, with SKIP LOCKED, skip the marked rows? Will T2 re-evaluate it's result set after the changes made by T1?

Solution

    1. Each transaction locks the rows as it finds them, so locking is not atomic. It could happen that T1 locks a couple of rows and T2 locks some other rows.

    2. Since each transaction locks rows immediately when it finds them, this cannot happen. Either a row is locked, in which case it is skipped, or it is not locked, in which case it is locked.

    3. If T1 commits before T2 is done scanning the table, T2 will happily lock all rows that were already processed by T1.

    4. Yes, that will work. T2 will fetch the most current version of each row before it checks the condition and locks the row.