google-cloud-platform google-cloud-spanner

Automatic created_on (not updated_on) timestamp on insert (not update) in Google Spanner

As far as I can tell, there is no safe way (in an concurrent environment) to do this, but I wanted to make sure I'm not missing anything.

Often, in our DBs, we like to track when a row was originally created and when it was last updated. Separately. This is not a "created_at" column that should actually be called "updated_on".

In Spanner, with commit timestamps (or even just always putting the current time), updated_on is easy. However, the usual tools I use for created_on:

default values, and never update
on duplicate key ...
triggers

don't seem to be available. I guess, maybe you could set up a cloud function, that seems like overkill (ironic that a cloud function would be overkill...).

The closest thing I can come with that's not totally odd is to try an insert mutation, catch an exception, check for ErrorCode.ALREADY_EXISTS, and then update. And only set created_on in the insert block. Ugggly... and also not really safe in the face of concurrent deletes (you insert, you catch error, someone deletes in between, try to update, boom)

Any other suggestions? Preferably via the SDK?

Solution

Note that, as clarified in my comment, I was looking for something clean, not a transaction, since the way transactions work is a bit ugly (the interface should at least should have lambdas as an option, instead anonymous classes :/.) Or better, just beginTransaction(), endTransaction(), e.g. :

            this.dbClient.readWriteTransaction()
            .run(
                new TransactionCallable<Void>() {
                    @Nullable
                    @Override
                    public Void run(TransactionContext transactionContext) throws Exception {
                        Struct row = transactionContext.readRow(
                            MY_TABLE,
                            Key.of(
                                keyCol1,
                                keyCol2
                            ),
                            Collections.singletonList(keyCol1)
                        );
                        //this creates a write builder with a bunch of columns
                        //set to mutated, except for CREATED_ON
                        WriteBuilder writeBuilder = updateBuilder(
                            Mutation.newInsertBuilder(MY_TABLE),
                            myDataModel
                        );
                        if(row == null) {
                            writeBuilder.set(CREATED_ON).to(Timestamp.now()).build();
                        }
                        Mutation recMut =
                            updateBuilder(Mutation.newUpdateBuilder(MY_TABLE), myDataModel).build();
                        transactionContext.buffer(recMut);
                        return null;
                    }
                }
            );

@RedPandaCurious is correct, that Scott's answer only half-works: (1) is bound to fail, for reasons outlined in the question - or taken another way, just re-states what I want to accomplish, without illustrating how (2) is just re-stating my follow-up comment, without providing any more details or docs.

@RedPandaCurious, if you want to note that transactions are faster than catching the exception, with some docs around that (I'm particularly curious if they are faster, overall, for a variety of workloads, in the face of many many concurrent operations, not necessarily just faster for the one client handling the exception), that makes sense as an answer-ish. In the end, though, transactions are the most correct, sane way to reason about it. Which is why I ended up with that approach - since either way was ugly.

OK, it turns out, if you remove the @Nullable annotation, you can use lambdas, and, with a little additional re-factoring, reduce this to:

 /**
 * Lambda interface for factoring out transactional execution logic
 */
 public interface SpannerOperation {
      Boolean doOperation(TransactionContext ctxt, Struct row);
  }

  private Boolean executeIfExists(
        ... key fields ...
        SpannerOperation spannerOperationIfExists,
        SpannerOperation spannerOperationifNotExists,
        Iterable<String> columns
    ) {
        return this.dbClient.readWriteTransaction().run(
            transactionContext -> {
                Struct row = transactionContext.readRow(
                    MY_TABLE,
                    Key.of(...), //you could even pass the key in as a key
                    columns
                );
                if(row != null) {
                    spannerOperation.doOperation(transactionContext, row);
                    return true;
                } else {
                   spannerOperationifNotExists.doOperation(transactionContext, null);
                    return false;
                }
            }
        );
    }

     public boolean doSomething(... keyStuff.. )
        return this.executeIfExists(
            .. key fields ...
            (ctxt, row) -> {
                Mutation mut = Mutation
                    .newUpdateBuilder(MY_TABLE)
                    .....//as you like it...
                    .build()
                ctxt.buffer(mut);
                return true;
            },
            (ctxt, row) -> false, //set created_on or whatever
            Collections.singleton(..some column you want to read in..)
        );

Note that this also works for stuff like appending to a list, etc, and it all gets factored down to just what you need. Google really needs an ifExists() method - I ended up using this in quite a few places.. ;)