google-cloud-platform google-cloud-spanner

Low read throughput in Cloud Spanner

I have a database populated with 100M rows of simple keys and values. The primary key is a random 32-byte string, and the value is a 32-byte string. (It's quite similar to YCSB, although smaller.)

I'm seeing wildly inconsistent throughput for point reads on a single node: up to 15k QPS at best, but sometimes much lower. The higher QPS seems to occur when I query a smaller subset of the keys. Is it possible that I'm running into some strange caching behavior?

Solution

Data caching (i.e. caching data read from secondary storage) should not affect your performance this severely, and it can generally be ignored in performance discussions for Cloud Spanner. However, Cloud Spanner does have a query cache, which might be part of the issue here.

Two factors could affect your performance this severely:

1) If you are using SQL queries for your point reads, make sure you are using query parameters. In other words, populate the params and paramTypes fields in your executeSql requests rather than inlining the key into the SQL string. The query text then stays identical across keys, which lets Spanner reuse its cached work for that query; it also provides better security. More information is available in this whitepaper on query performance.

2) If you are running a load test, run your workload for at least 30 minutes so that Spanner has a chance to optimize the distribution of your data by balancing (and creating new) splits across your nodes.
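To illustrate point 1, here is a minimal sketch using the Python client library (google-cloud-spanner). The table name `kv` and the column names `k` and `v` are assumptions, since the question does not give the schema, and the helper names are made up for this example. The key point is that the parameterized SQL text never changes from one key to the next.

```python
# Hypothetical schema: table `kv` with STRING columns `k` (primary key) and `v`.
PARAM_SQL = "SELECT v FROM kv WHERE k = @key"


def inlined_sql(key: str) -> str:
    """Anti-pattern: every key yields a different SQL text (and it is unsafe)."""
    return f"SELECT v FROM kv WHERE k = '{key}'"


def parameterized_request(key: str) -> dict:
    """SQL text and params for one point read; the text is the same for all keys."""
    return {"sql": PARAM_SQL, "params": {"key": key}}


def point_read(database, key: str):
    """Run the parameterized point read against a Spanner `Database` object."""
    # Imported lazily so the helpers above can be used without the library.
    from google.cloud import spanner

    request = parameterized_request(key)
    with database.snapshot() as snapshot:
        rows = list(
            snapshot.execute_sql(
                request["sql"],
                params=request["params"],
                param_types={"key": spanner.param_types.STRING},
            )
        )
    # A point read on the primary key returns at most one row.
    return rows[0][0] if rows else None
```

With a real database you would obtain the handle with something like `spanner.Client().instance("my-instance").database("my-db")` (both IDs are placeholders) and call `point_read(database, some_key)`.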

Note that you should see good read performance at any freshness level, including strong reads, and you may see a slight improvement if you use bounded-staleness reads.
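If you want to compare strong and bounded-staleness reads in your load test, a sketch along these lines might help. It assumes the same hypothetical `kv` table with columns `k` and `v`, uses the read API with a `KeySet` instead of SQL, and the helper names are invented for illustration.

```python
import datetime


def snapshot_kwargs(max_staleness_seconds=None) -> dict:
    """Keyword arguments for database.snapshot(): empty means a strong read."""
    if max_staleness_seconds is None:
        return {}
    return {"max_staleness": datetime.timedelta(seconds=max_staleness_seconds)}


def read_value(database, key: str, max_staleness_seconds=None):
    """Point read via the read API, strong by default or with bounded staleness."""
    # Imported lazily so snapshot_kwargs can be used without the library.
    from google.cloud import spanner

    with database.snapshot(**snapshot_kwargs(max_staleness_seconds)) as snapshot:
        rows = list(
            snapshot.read(
                table="kv",  # hypothetical table name
                columns=("v",),
                keyset=spanner.KeySet(keys=[[key]]),
            )
        )
    return rows[0][0] if rows else None
```

Calling `read_value(database, key)` issues a strong read, while `read_value(database, key, max_staleness_seconds=15)` lets Spanner serve data up to 15 seconds old; the 15-second value is only an example.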