Search code examples
databasepostgresqldatabase-designdatabase-concurrencymvcc

PostgreSQL - MVCC (multi-version concurrency control) - When is actual lock acquired?


As per my understanding, postgres uses two additional fields Xmin and Xmax to implement mvcc, Lets say we have Employee table with id and name columns.

Below are some crud operations and how they are working concurrently(considering isolation level = READ_COMMITTED) and the question is when and where actual lock is acquired.

  1. Insert -> A new transaction inserts a new record which is not visible to other transactions until it is committed, so no issues in this case and no lock or version control is required. Lets say id = 1, name = "aa" gets inserted. Postgres adds 2 additional columns for mvcc Xmin = current txn id (lets say 100) and Xmax = 0/null.
id  |   name | Xmin | Xmax
------------------------------
1   |   aa   | 100  | null
  1. Update with concurrent Read -

    a). A new transaction started to update name to "bb" (for id = 1). At the same time there is another transaction started to read the same data.

    b). A new Tuple (immutable object in postgres representing a row) gets created with Xmin = current transaction id (Lets say 200) and Xmax = null along with id = 1, name = bb. Also the older version of id = 1 gets updated to have Xmax = 200. Read transaction sees the older version of data with Xmin = 100 and returns. Is there any locking required in this case ? I think no but it might to update the older tuple's Xmax.

Below is same record with multiple versions(just for explanation purpose) with latest version having Xmax = null.

id  |   name | Xmin | Xmax
------------------------------
1   |   aa   | 100  | 200
1   |   bb   | 200  | null
  1. Update with concurrent Update -

    a). Transaction (with txn id = 300) started to update id = 1 to name = cc. Another transaction(txn id = 400) started to update the same record (id = 1) to name = dd. If this scenario also proceeds in same way by creating new tuple and marking old tuple's Xmax then i think it would create problems because both 300 and 400 will create a new tuple and mark the old tuple's Xmax = txn id. An update could loose in this case.

In this scenario does exclusive lock is acquired by first txn and other concurrent update txns wait until any ongoing txn is completed or there is some other way in which postgres handles it ?


Solution

  • Insert -> A new transaction inserts a new record which is not visible to other transactions until it is committed, so no issues in this case and no lock or version control is required.

    This is not true. The inserted tuple is locked upon insertion. This matters if, for example, there is a unique constraint and someone else tries to insert a conflicting tuple.

    Update with concurrent Read....Is there any locking required in this case

    While the xmax is getting updated, there is a "light weight" lock on the buffer holding the tuple which will be released as soon as the field is updated (not held for the duration of the transaction). This light weight lock includes a barrier to make sure any other process will see the change made, rather than seeing stale cached version of it.

    The reader will either see, depending on when it got there, either that xmax is 0 and return the tuple, or it will see that xmax is 200 and see that 200 is not yet committed, and return the tuple anyway because it knows the lock represented by xmax = 200 does not apply to it, a mere reader.

    Update with concurrent Update...

    The first process to write its id into xmax of the to-be-obsolete tuple will win. The second one will see someone else's valid id in xmax, and block until that other transaction commits or rolls back, then decide what to do. Due to the light-weight lock on the buffer holding the tuple, they can't both update xmax without noticing each others changes.