Tags: sql, postgresql, vert.x

Tuple concurrently updated when granting permissions


Struggling with database queries - not a db expert by any means, any help would be appreciated.

When dynamically creating databases and schemas, once in a while I get this error:

Unable to apply database grants.
io.vertx.core.impl.NoStackTraceThrowable: Error granting permission.
io.vertx.pgclient.PgException:
ERROR: tuple concurrently updated (XX000)

The role names, database names and schema names are replaced in the query strings elsewhere; I modified the code here to pass the query strings directly to the transaction for simplicity.

The permissions being granted are as follows:

private static final String ERR_PERMISSION_GRANT_ERROR_MESSAGE = "Error granting permission. ";
private static final String ADVISORY_LOCK = "SELECT pg_try_advisory_lock("
        + String.valueOf(BigInteger.valueOf(Double.valueOf(Math.random()).longValue())) + ")";  
private static final String CREATE_USER = "CREATE ROLE <role-name> LOGIN PASSWORD <pwd>;";
private static final String GRANT_PERMISSION1 = "GRANT CREATE, CONNECT ON DATABASE <db-name> TO <role-name>;";
private static final String GRANT_PERMISSION2 = "GRANT USAGE ON SCHEMA <schema-name> TO <role-name>;";
private static final String GRANT_PERMISSION3 = "GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA <schema-name> TO <role-name>";
private static final String GRANT_PERMISSION5 = "ALTER DEFAULT PRIVILEGES IN SCHEMA <schema-name> GRANT ALL ON SEQUENCES TO <role-name>;";


private static Promise<Boolean> grantDatabase(PgPool pool, String databaseName, String userName, String schemaName,
        Vertx vertx) {
    Promise<Boolean> promise = Promise.promise();

    pool.getConnection()
            // Transaction must use a connection
            .onSuccess(conn -> {
                // Begin the transaction
                conn.begin().compose(tx -> conn
                        // Various statements
                        .query(updateQueryString(ADVISORY_LOCK, databaseName, userName)).execute()
                        .compose(
                                res1 -> conn.query(
                                        updateQueryString(GRANT_PERMISSION1, databaseName, userName))
                                        .execute()
                                        .compose(res2 -> conn.query(
                                                updateQueryString(GRANT_PERMISSION2, schemaName, userName))
                                                .execute()
                                                .compose(res3 -> conn
                                                        .query(updateQueryString(
                                                                GRANT_PERMISSION3, schemaName, userName))
                                                        .execute()
                                                        .compose(res4 -> conn
                                                                .query(updateQueryString(GRANT_PERMISSION5,
                                                                        schemaName, userName))
                                                                .execute()))))
                        // Commit the transaction
                        .compose(res5 -> tx.commit()))
                        // Return the connection to the pool
                        .eventually(v -> conn.close())
                        .onSuccess(v -> promise.complete(Boolean.TRUE))
                        .onFailure(err -> promise
                                .fail(ERR_PERMISSION_GRANT_ERROR_MESSAGE + err.getMessage()));
            });
    return promise;
}

How do I fix the tuple concurrently updated error in this case? I only have a single instance of my service running.

PostgreSQL v14.6 (Homebrew), vertx-pg-client 4.3.8


Solution

  • You've probably already found this and established that the error is caused by the non-zero chance of two of your threads running those queries at the same time. They could also be competing with something else - the code you showed is likely not 1:1 what you're running, and there could be more grant/revoke/alter statements elsewhere.

    I think your plan to use advisory locks is better than the alternative of establishing a separate "grant queue" or trying to track and lock system tables.


    Locking approach

    private static final String ADVISORY_LOCK = "SELECT pg_try_advisory_lock(" 
    /* ... */
    .query(updateQueryString(ADVISORY_LOCK, databaseName, userName)).execute()
                            .compose( /* ... */
    

    You might want to change your advisory lock function.

    • pg_advisory_lock() would make the client wait for the lock if it's not available.

    • pg_try_advisory_lock(), instead of making the client wait for the lock to become available, immediately returns false. I don't see the code responding in any way to that result (true if it got the lock, false if it didn't), which means it just tries to acquire the lock and carries on regardless of the outcome - see the sketch after this list for one way to act on the result.

    • Both of the above obtain a session-level lock, so it won't be released unless you call pg_advisory_unlock() on the same ID. A lock obtained from pg_advisory_xact_lock() or pg_try_advisory_xact_lock() would be released automatically at commit/rollback.

      With a standalone connection, conn.close() ends the session, which makes the database release both the session- and transaction-level locks it held. With a pool, the session can live on after the connection is released back to it, still holding its session-level locks unless the pool happens to be configured to clean them up.
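
    If you keep pg_try_advisory_lock(), a minimal sketch of reacting to its result could look like the helper below (the helper name, the failure message and the fixed ID are mine, not from your code; SqlConnection and Future come from io.vertx.sqlclient and io.vertx.core):

    // Hypothetical helper: try to take the advisory lock and fail fast when another
    // session already holds it, instead of silently carrying on.
    private static Future<Void> acquireLockOrFail(SqlConnection conn) {
        return conn.query("SELECT pg_try_advisory_lock(123456789)").execute()
                .compose(rows -> {
                    boolean locked = rows.iterator().next().getBoolean(0);
                    if (!locked) {
                        return Future.failedFuture("Another grant is already in progress.");
                    }
                    return Future.succeededFuture();
                });
    }

    That said, since the grants already run inside conn.begin(), switching to pg_advisory_xact_lock() as described further down removes the need to check a result or unlock anything by hand.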


    ID used for locking

    Your use of Math.random() seems to always result in 0 because of the narrowing primitive conversion in Double.longValue():

    String.valueOf(         //BigInteger to String
       BigInteger.valueOf(  //long       to BigInteger
          Double.valueOf(   //double     to Double
              Math.random() //returns a double in [0.0, 1.0)
          ).longValue()     //Double to long, truncating it to 0
       )
    )
    

    This means you're already always re-using the same, static ID of 0.

    But if the intent was to randomise the ID so that each thread uses a different, unique lock ID, that would defeat the purpose: the threads wouldn't be able to block each other. Threads need to use the same lock ID to refer to the same "action" that could interfere with other threads attempting it at the same time.

    private static final String ADVISORY_LOCK = "SELECT pg_try_advisory_lock("
      + String.valueOf(BigInteger.valueOf(Double.valueOf(Math.random()).longValue()))
      + ")"; 
    
    --Random lock ID generated as 99:
    /*1st agent:*/ SELECT pg_try_advisory_lock(99);
    --lock acquired, 1st agent proceeds to run its queries
    
    --In parallel, 2nd agent gets a random ID of 77:
    /*2nd agent:*/ SELECT pg_try_advisory_lock(77);
    --77 wasn't locked, so it immediately proceeds to attempt the same action 
    --as the 1st agent, disregarding the fact that it can make them compete
    --and result in `ERROR: tuple concurrently updated`
    

    Aside from swapping pg_try_advisory_lock() for pg_advisory_xact_lock(), I think replacing that Math.random() with a static, arbitrary number will be enough:

    private static final String ADVISORY_LOCK = "SELECT pg_advisory_xact_lock("
            + "123456789"
            + ")"; 
    
    --now everyone trying to run those particular queries checks the same ID
    /*1st agent:*/ SELECT pg_advisory_xact_lock(123456789);
    --no one has called dibs on that ID so far, so it's allowed to proceed
    
    --In parallel, 2nd agent enters the same subroutine and asks about the same ID:
    /*2nd agent:*/ SELECT pg_advisory_xact_lock(123456789);
    --1st agent hasn't released the lock on that ID yet, so 2nd agent waits
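
    Put together on the Java side, a minimal sketch of the whole grant could then look like this (pool.withTransaction is the helper from vertx-sql-client 4.x; updateQueryString and the GRANT_PERMISSION constants are the ones from your question; the lock ID is arbitrary):

    private static final String ADVISORY_XACT_LOCK =
            "SELECT pg_advisory_xact_lock(123456789)"; // same ID for every grant attempt

    private static Future<Boolean> grantDatabase(PgPool pool, String databaseName, String userName,
            String schemaName) {
        // withTransaction begins and commits the transaction and returns the connection
        // to the pool; the transaction-level lock is released at commit/rollback.
        return pool.withTransaction(conn -> conn
                .query(ADVISORY_XACT_LOCK).execute()
                .compose(r1 -> conn.query(updateQueryString(GRANT_PERMISSION1, databaseName, userName)).execute())
                .compose(r2 -> conn.query(updateQueryString(GRANT_PERMISSION2, schemaName, userName)).execute())
                .compose(r3 -> conn.query(updateQueryString(GRANT_PERMISSION3, schemaName, userName)).execute())
                .compose(r4 -> conn.query(updateQueryString(GRANT_PERMISSION5, schemaName, userName)).execute())
                .map(r5 -> Boolean.TRUE));
    }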
    

    If competing parts of your app were initialising their own Random with the same, shared seed, or re-initialising a shared Random, they'd get the same ID - but that only trades a predefined, static ID for a predefined seed.

    Random, unique lock IDs could be useful to avoid accidental ID re-use for some unrelated action and to free you from having to keep track of what ID was used where. However, those IDs would have to be generated ahead of runtime or during each initialisation.
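
    If you'd rather not hand-pick numbers per call site, one possible shape for that (hypothetical names, not something from your code) is to pre-assign an ID per guarded action once, at compile time:

    // Hypothetical registry: each guarded action gets its own fixed lock ID, so callers
    // never improvise IDs and unrelated actions can't accidentally share one.
    enum AdvisoryLockId {
        GRANT_DATABASE(823_441_901L),
        CREATE_SCHEMA(823_441_902L);

        private final long id;

        AdvisoryLockId(long id) {
            this.id = id;
        }

        String xactLockSql() {
            return "SELECT pg_advisory_xact_lock(" + id + ")";
        }
    }

    Every caller of the same action then contends for the same ID, e.g. conn.query(AdvisoryLockId.GRANT_DATABASE.xactLockSql()).execute(), while unrelated actions keep their own.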