How to deal with timeout errors when trying to retrieve all documents from Google Cloud Firestore?

Given the example from the Cloud Firestore documentation, suppose I want to gather the population of every city. I would do:

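from google.cloud import firestore

db = firestore.Client()

docs = db.collection("cities").stream()  # one long-running read of the whole collection
populations = {}
for doc in docs:
    doc_dict = doc.to_dict()
    populations[doc_dict["name"]] = doc_dict["population"]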

However, my collection contains so many documents that the read fails with google.api_core.exceptions.ServiceUnavailable: 503 The datastore operation timed out, or the data was temporarily unavailable.

I checked the answers to "Timeout for firestore operations" and "Is it possible to increase the response timeout in Google App Engine?", and learned that there is no way to change the timeout period.

This leaves me with a question: how can I then get all the population numbers? The document IDs are not incremental, so I can't remember "where I was before the timeout", and I didn't find a "cursor"-like solution in the Firestore documentation.

Solution

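Cloud Firestore supports pagination through query cursors. The documentation even includes a Python code sample. Pay particular attention to the section on using a document snapshot to define the query cursor: you read a limited page of documents, then pass the last snapshot of that page to start_after() to request the next page, so no single read has to cover the whole collection.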
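As a rough illustration of that approach applied to the question's loop, here is a minimal sketch. The helper names stream_in_pages and collect_populations are hypothetical, the page size of 500 is an arbitrary assumption to tune, and ordering by "__name__" (the document ID) is my choice for getting a stable cursor; the documentation sample orders by a regular field instead.

```python
def stream_in_pages(collection, page_size=500):
    """Yield every document in `collection`, reading `page_size` at a time.

    `collection` is assumed to be a Firestore CollectionReference (or Query).
    """
    # Order by document ID so every document has a unique, stable position.
    base_query = collection.order_by("__name__").limit(page_size)
    query = base_query
    while True:
        # Each page is a separate, short read instead of one long stream.
        page = list(query.stream())
        yield from page
        if len(page) < page_size:
            break  # a short (or empty) page means nothing is left
        # Use the last snapshot of this page as the cursor for the next one.
        query = base_query.start_after(page[-1])


def collect_populations(collection, page_size=500):
    """Build the name -> population mapping from the question, page by page."""
    populations = {}
    for doc in stream_in_pages(collection, page_size):
        doc_dict = doc.to_dict()
        populations[doc_dict["name"]] = doc_dict["population"]
    return populations
```

With the client from the question, this would be called as populations = collect_populations(db.collection("cities")). If a single page still times out, a smaller page_size is the first thing to try.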