I am in the process of migrating to google cloud spanner for an Django application, on GAE, and found the response times to be very long, in the order of 2-3 seconds for and indexed query which takes 10 ms when running in the spanner query editor.
It is also only the first query that takes this long, and any subsequent queries takes around 100 ms.
I encounter the same issue when running a bare minimum query with the python client, I've attached the code below:
from google.cloud import spanner_v1
client = spanner_v1.Client()
i = client.instance("my-instance")
db = i.database("my-database")
start = time.time()
with db.snapshot() as snapshot:
QUERY = (
"""
SELECT
*
FROM
dates
WHERE
date_id = 4507
""")
result = snapshot.execute_sql(QUERY)
for row in result:
print(row)
end = time.time()
print((end-start)*1000)
which in this case takes 1154 ms. Running the query again (without reinitialising the database object) takes 110ms.
Ill include some outputs from a line-profiler run on the function corresponding to db.snapshot():
Timer unit: 0.001 s
Total time: 1.05145 s
File: /Users/opt/anaconda3/envs/dev/lib/python3.8/site-packages/google/cloud/spanner_v1/database.py
Function: __enter__ at line 859
Line # Hits Time Per Hit % Time Line Contents
==============================================================
859 def __enter__(self):
860 """Begin ``with`` block."""
861 1 1051.4 1051.4 100.0 session = self._session = self._database._pool.get()
862 1 0.0 0.0 0.0 return Snapshot(session, **self._kw)
Running this python==3.8.0 and google-cloud-spanner==3.11.1 on a Mac.. Any help or ideas where to troubleshoot is much appreciated
I think that what you are seeing are down to the following:
date_id
is the only column in the primary key of the table). The client should preferably be running on a Google App Engine instance that is in the same physical region as your Cloud Spanner instance (e.g. both could be in the region us-central1
, and not one in us-east1
and one in us-central1
).If I understand you correctly, the numbers that you provided are based on executions locally from your own development machine. I think that a large part of the execution time difference that you are seeing (compared to the GCP Console) for the second and further queries, are due to network latency between your local machine and Cloud Spanner.
It would be interesting to see if you get better numbers if you execute the test query from a GAE instance or a simple Google Compute Virtual Machine that is in the same physical region as the Cloud Spanner database.