Search code examples
sqlpostgresqltransactionslockingupsert

Postgres / SQL pattern for atomic SELECT and UPDATE or INSERT within a transaction


I have a simple scenario, where I want to atomically read and modify the state of a row. But the row may not exist yet.

For this example, I use the user_group_membership table:

user_id (pk) | group_id (pk) | state
-------------------------------------
1            | 3             | joined
  • User 1 is member of Group 3 with state joined (can also be invited or left or banned).
  • User 2 is not a member of Group 3. And never has been a member of that Group since there's no row in the table

The state value works like a state machine. There's a limited set of transitions:

null (no row present) -> invited, banned
invited -> joined, banned
joined  -> left, banned
left    -> invited, banned
banned  -> invited, left

If a row is already present I can use a SELECT ... FOR UPDATE to get the current state, validate the transition, update the state and commit the transaction. All other concurrent transactions will "wait" for the lock to be released. That's fine. In this case all state transitions run sequentially.

But if there is no row in the table, there's nothing to lock. So all concurrent transactions will try to execute an INSERT. The first will succeed and the rest will fail because of the duplicate primary key.

At this point I just could "rerun" the whole code, because now I know that the row exists and it will use the SELECT ... FOR UPDATE for locking/waiting. But I don't want to execute the same code twice. I'm looking for a more elegant solution.

What I came up with so far

This is a replacement for the SELECT ... FOR UPDATE:

INSERT INTO user_group_membership (user_id, group_id, state)
VALUES (2, 3, 'DUMMY_FOR_THE_ROW_LOCK')
ON CONFLICT (user_id, group_id) DO UPDATE
SET user_id = EXCLUDED.user_id
RETURNING *;

-- application code for validating state transition

UPDATE user_group_membership 
SET state = 'INVITED'
WHERE user_id = 2 AND group_id = 3;
  • This should prevent the situation where multiple concurrent transactions try to INSERT and will hit a duplicate key error.

  • The DO UPDATE part basically is a no-op, but it seems to be necessary to get RETURNING to work properly. This effectively replaces the SELECT.

Questions

  • Is this the right way to handle this scenario?
  • Is it "safe"?
  • Is there a better / easier solution?

Followup questions

  • How to properly handle dummy values? The state column is not nullable and of type enum (invited, joined, left, banned). Introducing a new enum value, which should never be used outside of this locking mechanism feels wrong. But I need some value to create and lock the row. Any ideas?

Solution

  • How to properly handle dummy values? The state column is not nullable

    Don't use them. If you do, make it nullable and denote lack of user-group relation with a state is null.

    Is this the right way to handle this scenario?
    Is it "safe"?
    Is there a better / easier solution?

    If it works, it works. It's not unsafe but magic/dummy/flag/trip/rogue/signal/sentinel values aren't very elegant.

    Make the selection, validation and upsert a single operation:
    demo at db<>fiddle

    prepare find_validate_apply(int,int,text) as
    with find as(
      select state
      from user_group_membership as f
      where $1=user_id and $2=group_id
      for update of f
      --limit 1--unncessary given the uniqueness and non-nullability
    ),empty_as_null as(--`coalesce()` for rows
     (select state from find)
      union(select null)
      order by 1 nulls last limit 1
    ),validate as(
      select exists(select from allowed_transitions as t
                    where t.source_state is not distinct from found.state
                    and t.target_state is not distinct from $3) 
             as is_transition_allowed
      from empty_as_null as found
    ),apply as(
      insert into user_group_membership
      select $1,$2,$3
      from validate 
      where is_transition_allowed
      on conflict(user_id,group_id)do update
      set state=$3
      returning *)
    select*from apply;
    

    Now each worker can wait for others to finish their whole thing before they attempt to apply their desired transition. Note that this doesn't require a dummy value or sentinel row - they lock each other out either in the initial select or in the final upsert (concurrent writes to the unique pk index), but the whole thing is one atomic operation so it doesn't matter which one exactly.

    The demo uses an audit trigger to track operations on the table and dblink to run parallel workers - at the end you can see how they queued up and applied their changes one after another:

    create function trg1_f()returns trigger as $f$
    begin
      create table if not exists user_group_membership_history
      as select TG_OP,transaction_timestamp(),clock_timestamp()
               ,OLD.user_id as OLD_user_id
               ,OLD.group_id as OLD_group_id
               ,OLD.state as OLD_state
               ,NEW.* 
      limit 0;
    
      insert into user_group_membership_history
      select TG_OP,now(),clock_timestamp(),OLD.*,NEW.*;
      
      return new;
    end $f$ language plpgsql;
    
    create trigger trg1 
    after insert or update or delete
    on user_group_membership
    for each row execute function trg1_f();
    
    create extension dblink;
    select dblink_connect('another_worker1','');--defaults to localhost here
    select dblink_connect('another_worker2','');
    select dblink_send_query('another_worker1',
      $q$ begin;
          select pg_sleep(0.5);
          select find_validate_apply(9,9,'invited');
          select pg_sleep(1);
          commit;
      $q$);
    
    select dblink_send_query('another_worker2',
      $q$ begin;
          select pg_sleep(0.7);
          select find_validate_apply(9,9,'banned');
          select pg_sleep(1);
          commit;
      $q$);
    select pg_sleep(4);
    
    table user_group_membership_history;
    
    user_id group_id state
    9 9 banned
    tg_op transaction_timestamp clock_timestamp old_user_id old_group_id old_state user_id group_id state
    INSERT 2024-11-13 16:18:18.929716+00 2024-11-13 16:18:19.94236+00 null null null 9 9 invited
    UPDATE 2024-11-13 16:18:18.932864+00 2024-11-13 16:18:20.946128+00 9 9 invited 9 9 banned

    The allowed_transitions idea is self-explanatory:

    create table allowed_transitions(source_state,target_state)
    as values
     (null,'invited'),(null, 'banned')
    ,('invited','joined'),('invited','banned')
    ,('joined','left'),('joined','banned')
    ,('left','invited'),('left','banned')
    ,('banned','invited'),('banned','left');
    alter table allowed_transitions 
      add constraint uniq unique(source_state,target_state);