Search code examples
sqlpostgresqlconcurrencytransactionsacid

How to pick transaction isolation levels?


I have a table in database that is responsible for storing ordered/reorderable lists. It has the following shape:

| id | listId | index | title | ... |

where id is primary key, listId is foreign key that identifies what list the item belongs to, title and other columns are contents of items. index property is responsible for position of item in list. It is an integer counter (starting with 0) that is unique in the scope of the list, but may repeat across lists. Example data:

| id      | listId  | index | title    | ...
---------------------------------------------
| "item1" | "list1" | 0     | "title1" | ...
| "item2" | "list1" | 1     | "title2" | ...
| "item3" | "list1" | 2     | "title3" | ...
| "item4" | "list2" | 0     | "title4" | ...
| "item5" | "list2" | 1     | "title5" | ...

Users can create/delete items, move them inside the list or across lists. To ensure consistency of indexes when running these operations, I do the following:

Create item:

  1. Count items within this list
SELECT COUNT(DISTINCT "Item"."id") as "cnt" 
FROM "item" "Item" 
WHERE "Item"."listId" = ${listId}
  1. Insert new item, with index set to count from step 1:
INSERT INTO "item"("id", "listId", "index", "title", ...) 
VALUES (${id}, ${listId}, ${count}, ${title})

This way index grows with each item inserted into the list.

Move item:

  1. Retrieve item's current listId and index:
SELECT "Item"."listId" AS "Item_listId", "Item"."index" AS "Item_index" 
FROM "item" "Item" 
WHERE "Item"."id" = ${id}
  1. Change index of "shifted" items if necessary, so that order is consistent, e.g. given the item is moved forward, all items between its current position (exclusively) and its next position (inclusively) need to have their index decreased by 1:
UPDATE "item" 
  SET "index" = "index" - 1 
WHERE "listId" = ${listId} 
  AND "index" BETWEEN ${sourceIndex + 1} AND ${destinationIndex}

I'll omit the variation with movement across lists because it is very similar.

  1. Update the item itself:
UPDATE "item" 
   SET "index" = ${destinationIndex} 
WHERE "id" = ${id}

Delete item:

  1. Retrieve item's index and listId

  2. Move all items in same list that are next to this item 1 step back, to remove the gap

UPDATE "item" 
  SET "index" = "index" - 1 
WHERE "listId" = ${listId} 
  AND "index" > ${itemIndex}
  1. Delete item:
DELETE FROM "item" 
WHERE "id" = ${id}

Question is:

What transaction isolation levels should I provide for each of these operations? It is very important for me to keep index column consistent, no gaps and most importantly - no duplicates. Am I getting it right that create item operation is subject to phantom reads, because it counts items by some criteria, and it should be serializable? What about other operations?


Solution

  • Without knowing more about your specific application, the safest bet is indeed to use serializable as isolation level whenever you access that table but even that level may not be sufficient for your specific case.

    A unique constraint on (listId, index) would prevent duplicates (what about the title? Can it be repeated in the same list?), some accurately crafted "watchdog" queries can further mitigate issues and database sequences or stored procedures can ensure that there are no gaps but truth is the mechanism itself seems fragile.

    Knowing only so much of your specific problem, what you appear to have is a concurrency problem at user level in the sense that several users can access the same objects at the same time and make changes on them. Assuming this is your typical web-application with a stateless back-end (hence inherently distributed) this may carry a large amount of implications in terms of user experience reflecting on the architecture and even functional requirements. Say for example that user Foo moves item Car to List B which is currently being worked on by user Bar. It is then legit to assume that Bar will need to see item Car as soon as the operation is completed, but that will not happen unless there's some mechanism in place to immediately notify users of List B of the change. The more users you have working on the same set of lists, the worse it becomes even with notifications as you would have more and more of them up to the point where users see things changing all the time and just can't keep up with it.

    There's a lot of assumptions anyone will make to provide you an answer. My own lead me to state that you probably need to revise the requirements for that application or ensure that management is aware of several limitations and that they accept them. This type of problem is pretty common in distributed applications. Usually "locks" on certain sets of data are placed (either through database or shared memory pools) so that only one user can alter them at any given time or, alternatively, a workflow is provided to manage conflicting operations (much like versioning systems). When neither is done, a log of operations is kept to understand what happened and rectify problems later on should they be detected.