Google Spanner recommends against using things like timestamps or sequential numbers as the first part of a primary key or index, which makes sense given the architecture. However, for my requirements, I do need some way of ensuring a strict “append-only” ordering of rows.
I’m using Spanner to model events (as in event sourcing). Each event has a category, a stream id identifying a sequence in which events need to be strictly ordered relative to each other, and a few payload fields; I'll ignore the actual payload from here on.
Naively, this would be modelled as:
| Column | Type |
| --- | --- |
| Category | STRING |
| Stream Id | STRING |
| Sequence Nr | INT64 |
(With a primary key consisting of Category, Stream Id, Sequence Nr.) This would ensure a strong ordering of events within one stream. However, some categories have a lot of events associated with them, and Spanner best practice is to have variance in the high-order part of the key, so it would be better to flip this around. Each "stream" will contain a reasonably small number of events (thousands, not millions) and will be read together, so flipping the order gives better distribution of data while keeping locality for the events belonging to one stream:
| Column | Type |
| --- | --- |
| Stream Id | STRING |
| Category | STRING |
| Sequence Nr | INT64 |
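As a rough sketch in Spanner DDL (table and column names such as `Events` and `Payload` are my own placeholders), the flipped schema could look something like this:

```sql
-- Sketch only: StreamId leads the key so writes spread across splits
-- (assuming stream ids themselves are well distributed, e.g. UUIDs), while
-- the (StreamId, Category, SequenceNr) key keeps one stream's events contiguous.
CREATE TABLE Events (
  StreamId   STRING(64)  NOT NULL,
  Category   STRING(64)  NOT NULL,
  SequenceNr INT64       NOT NULL,
  Payload    STRING(MAX)
) PRIMARY KEY (StreamId, Category, SequenceNr);
```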
However, as I'd like to be able to append events without having to read the current state to find out the current sequence number, I'd rather use a timestamp.
| Column | Type | Options |
| --- | --- | --- |
| Stream Id | STRING | |
| Category | STRING | |
| Timestamp | TIMESTAMP | allow_commit_timestamp |
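Sketched as DDL (again with assumed names; I've called the column `CommitTs` here to avoid clashing with the `TIMESTAMP` type keyword):

```sql
-- Sketch only. With allow_commit_timestamp enabled, Spanner fills CommitTs
-- with the transaction's commit time when the write passes the
-- commit-timestamp placeholder instead of an explicit value.
CREATE TABLE Events (
  StreamId STRING(64) NOT NULL,
  Category STRING(64) NOT NULL,
  CommitTs TIMESTAMP  NOT NULL OPTIONS (allow_commit_timestamp = true),
  Payload  STRING(MAX)
) PRIMARY KEY (StreamId, Category, CommitTs);
```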
Spanner has a built-in commit timestamp that stamps each row with the time the transaction was actually committed. But to the question, finally:
Is it possible to represent data as above and get unique commit timestamps even if I commit multiple events in one transaction?
If not, is it possible to ensure strict ordering some other way, by adding additional columns to ensure order?
The documentation states that "Commit timestamp values are not guaranteed to be unique. Transactions that write to non-overlapping sets of fields might have the same timestamp. Transactions that write to overlapping sets of fields have unique timestamps." - but I'm not quite clear on what constitutes "sets of fields" in this context.
The documentation also states that "The commit timestamp makes creating the changelog easier, because the timestamps can enforce ordering of the changelog entries." but it's not clear what ordering guarantees commit timestamps provide when there are multiple concurrent writers, or when multiple events are written at the same time.
If you have multiple events in the same transaction, then they will all have the same commit timestamp.
A field is a table cell (one column value in one row). So "non-overlapping sets of fields" in this context basically means separate rows, because one of the fields being written is the commit timestamp!
Two independent transactions, one updating row 'R1' and one updating row 'R2' in the same table, can theoretically have the same commit timestamp, as they do not overlap.
> Is it possible to represent data as above and get unique commit timestamps even if I commit multiple events in one transaction?
In the example you give, where you use the commit timestamp in your primary key, then no: you would not be able to add multiple events to the same stream_id/category pair in a single transaction, as they would have the same timestamp and therefore the same primary key.
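To make that concrete, a sketch using the assumed `Events` table from above: a single-transaction append of two events to the same stream resolves both rows to the same key, so Spanner rejects it.

```sql
-- Both rows resolve to the same (StreamId, Category, CommitTs) key at commit
-- time, so this single-transaction insert fails with a duplicate-key error.
INSERT INTO Events (StreamId, Category, CommitTs, Payload)
VALUES ('stream-1', 'orders', PENDING_COMMIT_TIMESTAMP(), '{"type":"Created"}'),
       ('stream-1', 'orders', PENDING_COMMIT_TIMESTAMP(), '{"type":"ItemAdded"}');
```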
> If not, is it possible to ensure strict ordering some other way, by adding additional columns to ensure order?
If you use a combination of the commit timestamp and a sequence_number for each (stream_id, category, timestamp) tuple, then you can keep strict ordering even within a single transaction:
Increment a sequence number, starting at 0, for each (stream_id, category) pair written in the same transaction. The commit timestamp then ensures the order across different transactions, and the sequence number ensures the order within the transaction.
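A rough sketch of what this could look like, reusing the assumed `Events`/`CommitTs` names from earlier (one possible shape, not a definitive implementation):

```sql
-- Combined key: the commit timestamp orders events across transactions,
-- and SequenceNr orders events written within the same transaction.
CREATE TABLE Events (
  StreamId   STRING(64) NOT NULL,
  Category   STRING(64) NOT NULL,
  CommitTs   TIMESTAMP  NOT NULL OPTIONS (allow_commit_timestamp = true),
  SequenceNr INT64      NOT NULL,
  Payload    STRING(MAX)
) PRIMARY KEY (StreamId, Category, CommitTs, SequenceNr);

-- Appending two events to one stream in a single transaction: both rows get
-- the same commit timestamp, and SequenceNr (0, 1, ...) keeps them ordered.
INSERT INTO Events (StreamId, Category, CommitTs, SequenceNr, Payload)
VALUES ('stream-1', 'orders', PENDING_COMMIT_TIMESTAMP(), 0, '{"type":"Created"}'),
       ('stream-1', 'orders', PENDING_COMMIT_TIMESTAMP(), 1, '{"type":"ItemAdded"}');
```

Reading a stream back in order is then just an `ORDER BY CommitTs, SequenceNr` within the (StreamId, Category) prefix.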