Search code examples
pythongoogle-cloud-spanner

Why am I seeing slow response times from google cloud spanner?


I am in the process of migrating to google cloud spanner for an Django application, on GAE, and found the response times to be very long, in the order of 2-3 seconds for and indexed query which takes 10 ms when running in the spanner query editor.

It is also only the first query that takes this long, and any subsequent queries takes around 100 ms.

I encounter the same issue when running a bare minimum query with the python client, I've attached the code below:

from google.cloud import spanner_v1
client = spanner_v1.Client()

i = client.instance("my-instance")
db = i.database("my-database")

start = time.time()
with db.snapshot() as snapshot:
    QUERY = (
        """
            SELECT
              *
            FROM
              dates
            WHERE 
              date_id = 4507
        """)
    result = snapshot.execute_sql(QUERY)

    for row in result:
        print(row)

end = time.time()
print((end-start)*1000)

which in this case takes 1154 ms. Running the query again (without reinitialising the database object) takes 110ms.

Ill include some outputs from a line-profiler run on the function corresponding to db.snapshot():

Timer unit: 0.001 s

Total time: 1.05145 s
File: /Users/opt/anaconda3/envs/dev/lib/python3.8/site-packages/google/cloud/spanner_v1/database.py
Function: __enter__ at line 859

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   859                                               def __enter__(self):
   860                                                   """Begin ``with`` block."""
   861         1       1051.4   1051.4    100.0          session = self._session = self._database._pool.get()
   862         1          0.0      0.0      0.0          return Snapshot(session, **self._kw)

Running this python==3.8.0 and google-cloud-spanner==3.11.1 on a Mac.. Any help or ideas where to troubleshoot is much appreciated


Solution

  • I think that what you are seeing are down to the following:

    1. The first query will be slower than the rest, as the first query will have to wait for the session pool that is used by the Python client to be created, and for at least one session to have been created and returned by Cloud Spanner. All subsequent queries will reuse sessions from the pool, which is why you are seeing a lot lower values from the second query on.
    2. The total execution time for the second query will depend on the physical distance between your client and the Cloud Spanner instance and the network quality. I assume that the query is selecting one row based on the primary key (i.e. date_id is the only column in the primary key of the table). The client should preferably be running on a Google App Engine instance that is in the same physical region as your Cloud Spanner instance (e.g. both could be in the region us-central1, and not one in us-east1 and one in us-central1).

    If I understand you correctly, the numbers that you provided are based on executions locally from your own development machine. I think that a large part of the execution time difference that you are seeing (compared to the GCP Console) for the second and further queries, are due to network latency between your local machine and Cloud Spanner.

    It would be interesting to see if you get better numbers if you execute the test query from a GAE instance or a simple Google Compute Virtual Machine that is in the same physical region as the Cloud Spanner database.