
Unexpected failures using Serializable transaction level in Postgres


We're developing a lightweight CRUD app and have opted to use the Serializable isolation level for our transactions.

However, when ramping up the load on our environment we're seeing large numbers of failures for transactions that we didn't think would cause any problems. In particular, we have one transaction which we've managed to strip down to the following and still replicate problems:

// Exposed transaction: SERIALIZABLE isolation, up to 3 repetition attempts on failure
transaction(Connection.TRANSACTION_SERIALIZABLE, 3) {
    val record = MyRecord(UUID.randomUUID(), UUID.randomUUID(), DEFAULT_JSON)
    myDao().getRecord(record.id)   // SELECT by primary key
    myDao().addRecord(record)      // INSERT the new row
}

which translates into SQL as:

SELECT mytable.id, mytable.user_id, mytable.json, mytable.deleted_at
FROM mytable
WHERE mytable.id = '93ea4a65-cd52-4d73-ae74-38055c1b066b';

INSERT INTO mytable (deleted_at, json, id, user_id)
VALUES (NULL, '{"version":7}', '93ea4a65-cd52-4d73-ae74-38055c1b066b', '026d3c48-cdc5-4748-927b-408712e00f89');

I.e. a simple retrieve-then-insert via a PRIMARY KEY UUID column. When ramping this up (e.g. 40 threads, each running 50 of these transactions in a row), we're seeing the vast majority of them fail with the following exception:

o.p.u.PSQLException: ERROR: could not serialize access due to read/write dependencies among transactions

Detail: Reason code: Canceled on identification as a pivot, during write.
Hint: The transaction might succeed if retried.

at o.p.c.v.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2433)
at o.p.c.v.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2178)
at o.p.c.v.QueryExecutorImpl.execute(QueryExecutorImpl.java:306)
at o.p.jdbc.PgStatement.executeInternal(PgStatement.java:441)
at o.p.jdbc.PgStatement.execute(PgStatement.java:365)
at o.p.j.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:155)
at o.p.j.PgPreparedStatement.executeUpdate(PgPreparedStatement.java:132)
at o.j.e.s.s.InsertStatement.execInsertFunction(InsertStatement.kt:86)
at o.j.e.s.s.InsertStatement.executeInternal(InsertStatement.kt:95)
at o.j.e.s.s.InsertStatement.executeInternal(InsertStatement.kt:12)
at o.j.e.s.s.Statement.executeIn$exposed(Statement.kt:59)
... 90 common frames omitted
Wrapped by: o.j.e.e.ExposedSQLException: org.postgresql.util.PSQLException: ERROR: could not serialize access due to read/write dependencies among transactions
Detail: Reason code: Canceled on identification as a pivot, during write.
Hint: The transaction might succeed if retried.
at o.j.e.s.s.Statement.executeIn$exposed(Statement.kt:61)
at o.j.e.s.Transaction.exec(Transaction.kt:129)
at o.j.e.s.Transaction.exec(Transaction.kt:123)
at o.j.e.s.s.Statement.execute(Statement.kt:29)
at o.j.e.sql.QueriesKt.insert(Queries.kt:44)
at g.c.e.d.MyDao.insertEvent(DefaultEventsDao.kt:40)
... 81 common frames omitted
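
For reference, the ramp-up is driven by a harness along these lines. This is a minimal sketch rather than our exact code: rampUp is an illustrative name, the thread-pool details are only indicative, and MyRecord, myDao() and DEFAULT_JSON are the same placeholders as in the snippet above.

import java.sql.Connection
import java.util.UUID
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit
import org.jetbrains.exposed.sql.transactions.transaction

// Illustrative harness: 40 worker threads, each running 50 of the stripped-down
// transactions back to back against the same table.
fun rampUp(threads: Int = 40, transactionsPerThread: Int = 50) {
    val pool = Executors.newFixedThreadPool(threads)
    repeat(threads) {
        pool.execute {
            repeat(transactionsPerThread) {
                transaction(Connection.TRANSACTION_SERIALIZABLE, 3) {
                    val record = MyRecord(UUID.randomUUID(), UUID.randomUUID(), DEFAULT_JSON)
                    myDao().getRecord(record.id)   // SELECT by primary key
                    myDao().addRecord(record)      // INSERT the new row
                }
            }
        }
    }
    pool.shutdown()
    pool.awaitTermination(5, TimeUnit.MINUTES)
}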

Digging into pg_locks whilst the threads are running, we can see:

| locktype | database | relation | page | tuple  | virtualxid | transactionid | classid | objid  | objsubid | virtualtransaction | pid  | mode       | granted | fastpath |
|----------|----------|----------|------|--------|------------|---------------|---------|--------|----------|--------------------|------|------------|---------|----------|
| page     | 18496    | 17542    | 2    | <null> | <null>     | <null>        | <null>  | <null> | <null>   | 30/75              | 1467 | SIReadLock | True    | False    |
| page     | 18496    | 17542    | 5    | <null> | <null>     | <null>        | <null>  | <null> | <null>   | 34/45              | 1471 | SIReadLock | True    | False    |
| page     | 18496    | 17542    | 2    | <null> | <null>     | <null>        | <null>  | <null> | <null>   | 8/335              | 1446 | SIReadLock | True    | False    |
| page     | 18496    | 17542    | 1    | <null> | <null>     | <null>        | <null>  | <null> | <null>   | 31/65              | 1468 | SIReadLock | True    | False    |
| page     | 18496    | 17542    | 6    | <null> | <null>     | <null>        | <null>  | <null> | <null>   | 43/15              | 1480 | SIReadLock | True    | False    |
| page     | 18496    | 17542    | 4    | <null> | <null>     | <null>        | <null>  | <null> | <null>   | 5/357              | 1482 | SIReadLock | True    | False    |
| page     | 18496    | 17542    | 6    | <null> | <null>     | <null>        | <null>  | <null> | <null>   | 41/15              | 1478 | SIReadLock | True    | False    |
| page     | 18496    | 17542    | 6    | <null> | <null>     | <null>        | <null>  | <null> | <null>   | 40/30              | 1477 | SIReadLock | True    | False    |

Where relation 17542 corresponds to the primary key index of our table (verified by querying pg_class). So it appears that the transactions are taking page-level predicate locks (SIReadLocks) for the SELECT, and are then failing because other transactions are concurrently inserting into the same page.
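
The pg_class check was roughly the following, run from a separate session while the test threads are active (a simple sketch, not part of our application code):

-- Map the SIReadLock entries in pg_locks back to the relation they are held on.
SELECT c.relname, l.locktype, l.page, l.virtualtransaction, l.pid, l.granted
FROM pg_locks AS l
JOIN pg_class AS c ON c.oid = l.relation
WHERE l.mode = 'SIReadLock'
ORDER BY c.relname, l.page;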

This theory gains weight from the fact that, as the table grows, the failure rate reproduced by our test decreases (the records are spread over more pages, so fewer collisions occur).

My questions, then, are:

  • Is this expected behaviour in Postgres? Shouldn't a simple pkey lookup need at most a tuple lock, not a page lock?
  • Is this something we should expect to be able to do at the Serializable isolation level? Lowering to repeatable read (a short sketch of that variant follows this list) eliminates the problem, but we're reluctant to do that without understanding more.
  • Is there anything obvious we should be doing to help Postgres in this situation? E.g. an annotation for the query or a setting that we could enable?
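
For context, the repeatable-read variant mentioned above is simply the same block with a different isolation constant. A minimal sketch (addRecordAtRepeatableRead is an illustrative name; MyRecord, myDao() and DEFAULT_JSON are the same placeholders as before):

import java.sql.Connection
import java.util.UUID
import org.jetbrains.exposed.sql.transactions.transaction

// Same stripped-down transaction, but at REPEATABLE READ instead of SERIALIZABLE.
// This makes the serialization failures disappear in our test, at the cost of the
// stronger guarantees we were hoping to rely on.
fun addRecordAtRepeatableRead() {
    transaction(Connection.TRANSACTION_REPEATABLE_READ, 3) {
        val record = MyRecord(UUID.randomUUID(), UUID.randomUUID(), DEFAULT_JSON)
        myDao().getRecord(record.id)
        myDao().addRecord(record)
    }
}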

We are using Exposed within a Ktor backend written in Kotlin, in case that's relevant. Our Postgres version is 9.6.


Solution

  • This is working as expected:

    For optimal performance when relying on Serializable transactions for concurrency control, these issues should be considered:

    [...]

    In your test case, there are three or more predicate locks on one table page, so they are escalated to a single page lock. That is why the transactions conflict with each other (they now affect the same page).

    Try increasing max_pred_locks_per_page.
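
    As a rough illustration of both points: the first query below (run during the load test) shows the predicate locks collapsing from tuple to page granularity, and the statements after it show one way the setting could be raised. The value 16 is only an example, and this assumes a PostgreSQL version that exposes max_pred_locks_per_page as a configuration parameter and a role allowed to run ALTER SYSTEM.

    -- Count SIReadLocks by granularity; once enough tuple locks accumulate on a
    -- single page they are promoted to one page-level lock, as seen above.
    SELECT relation::regclass AS relation, locktype, page, count(*) AS locks
    FROM pg_locks
    WHERE mode = 'SIReadLock'
    GROUP BY relation, locktype, page
    ORDER BY locks DESC;

    -- Raise the per-page promotion threshold (illustrative value) and reload the configuration.
    ALTER SYSTEM SET max_pred_locks_per_page = 16;
    SELECT pg_reload_conf();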