google-cloud-platform, google-cloud-spanner

Does binning a monotonically increasing value improve the performance of Google Cloud Spanner?


Google Cloud Spanner recommends against putting an index on a monotonically increasing (non-PK) column such as a timestamp (https://cloud.google.com/spanner/docs/schema-design), but my specification requires querying by a (monotonically increasing) timestamp column.

I'm planning to work around this limitation by binning the time axis into 1-minute intervals (e.g. 10:00:35 -> 10:00:00). Does this work well for Google Cloud Spanner?


Solution

  • Using a timestamp column as the first column of an index is generally not recommended: because rows are inserted in increasing timestamp order, the writes can hotspot the single server that serves the end of the key space. One way to mitigate this hotspotting is sharding: https://cloud.google.com/spanner/docs/schema-design#fix_hash_the_key

    With sharding, the query may also need to be modified to look for timestamps (within the desired range) across all shards.
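
As a rough illustration of the sharding idea, here is a minimal Python sketch. It only computes a shard number and builds a query string; it does not call the Spanner client library. The table `Events`, the columns `EventId`, `ShardId` and `EventTime`, the parameters `@start_time` and `@end_time`, and the shard count of 8 are all hypothetical names chosen for the example, and the index is assumed to start with (`ShardId`, `EventTime`). Note also that rounding timestamps to the minute still yields values that increase over time, so binning alone would probably not move inserts away from the end of the index.

```python
import zlib

NUM_SHARDS = 8  # assumed shard count for illustration, not a recommended value


def shard_id(row_key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a row key to a stable shard number in [0, num_shards)."""
    # zlib.crc32 is deterministic across processes, unlike Python's
    # built-in hash() for strings, so every writer picks the same shard.
    return zlib.crc32(row_key.encode("utf-8")) % num_shards


def range_query_sql(num_shards: int = NUM_SHARDS) -> str:
    """Build SQL that reads one time range from every shard.

    Assumes a hypothetical table Events with an index whose leading
    columns are (ShardId, EventTime).
    """
    return (
        "SELECT EventId, EventTime FROM Events "
        f"WHERE ShardId BETWEEN 0 AND {num_shards - 1} "
        "AND EventTime >= @start_time AND EventTime < @end_time"
    )


if __name__ == "__main__":
    # The shard is stored with the row at write time.
    print(shard_id("event-42"))
    # The read covers every shard for the requested time range.
    print(range_query_sql())
```

The shard number is derived from the row key rather than from the timestamp, so consecutive inserts spread across `NUM_SHARDS` ranges of the index instead of all landing at its end. The cost is that every time-range read has to cover all shards, as the `BETWEEN` clause does here.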