`select for update` does not return updated row when another process updated a foreign key in that row and select joins two tables

According to the PSQL docs on explicit locking:

SELECT FOR UPDATE will wait for a concurrent transaction that has run any of those commands on the same row, and will then lock and return the updated row (or no row, if the row was deleted).

However, it seems that it is not return the updated row when I update a foreign key in that row and I use join with other table in my select statement. For other columns it works as expected. It seems that when a foreign key column is updated in the concurrent transaction, the updated row is not visible and it behaves like the row was removed.

Below is my short example. I expect that both (concurrent) transactions update the row. However, only one of them does and the second returns nothing.

But when the update statement in first transaction is changed to update of non-FK column, both transactions can select the row, as expected according to the docs.

Note that select has join to foo table even though that table is not updated, but I actually need some columns from foo, this is just a minimal example. Without join it works as expected, even when updating fk. So there are two important conditions for this behavior - 1. update a FK column, 2. use JOIN in select statement.

create table foo(pk int primary key);
create table bar(pk int, fk int references foo(pk), value int);
insert into foo values (1), (2), (3);
insert into bar values (1, 1, 1);

---- First transaction
begin;
select * from bar join foo on foo.pk = bar.fk where bar.pk = 1 for update of bar;
-- Now run select in the second transaction concurrently, see code below and then get back here.

update bar set fk = 2;
-- Update value column instead of fk and the second transaction will return row as expected.
-- update bar set value = 10;
commit;


---- Second (concurrent) transaction
begin;
select * from bar join foo on foo.pk = bar.fk where bar.pk = 1 for update of bar;
-- Continue in the first transaction.
-- Returns nothing after first transaction is commited; The row is not locked so we can't update it.
commit;

What did I miss in the docs? Any explanation for this behavior? Ideally with references.

It should be reproducible in a clean DB, no changes in config, read committed.

Solution

You missed this part of the documentation:

UPDATE, DELETE, SELECT FOR UPDATE, and SELECT FOR SHARE commands behave the same as SELECT in terms of searching for target rows: they will only find target rows that were committed as of the command start time. However, such a target row might have already been updated (or deleted or locked) by another concurrent transaction by the time it is found. In this case, the would-be updater will wait for the first updating transaction to commit or roll back (if it is still in progress). If the first updater rolls back, then its effects are negated and the second updater can proceed with updating the originally found row. If the first updater commits, the second updater will ignore the row if the first updater deleted it, otherwise it will attempt to apply its operation to the updated version of the row. The search condition of the command (the WHERE clause) is re-evaluated to see if the updated version of the row still matches the search condition. If so, the second updater proceeds with its operation using the updated version of the row. In the case of SELECT FOR UPDATE and SELECT FOR SHARE, this means it is the updated version of the row that is locked and returned to the client.

(I emphasized the relevant parts.)

When the join operation is evaluated again after the lock is released, the updated row no longer meets the condition.