google-app-engine, google-cloud-datastore, app-engine-ndb

What is being done in all this non-rpc time for a fetch?


Say I have some model Object, and ~2000 entities in my datastore. Using Appstats with the following code:

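from google.appengine.ext import ndb

# Disable the in-context cache and memcache so every entity comes from the datastore.
ndb.get_context().set_cache_policy(lambda key: False)
ndb.get_context().set_memcache_policy(lambda key: False)
objects = Object.query().fetch()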

I get the following profile:

[Image: Query profile]

What is being done for the ~18 seconds that it's not waiting for RPCs?


Solution

  • Most likely the time goes into deserializing those entities into Python objects, which is very slow. You shouldn't be fetching that many entities in a single client-facing web request anyway, and if this is some kind of batch job, the time shouldn't matter that much. Note also that once you go over several thousand items, your requests will likely time out at some point, so you will need to use something like query cursors.

    You may also find this and this blog post helpful; both describe some hacks to speed up the deserialization process.

    Also, unrelated, but this is one of the many cases where Golang would shine and far outperform Python on the exact same task (that delay would be almost non-existent).
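
The query-cursor suggestion above could look roughly like the sketch below. It is only an illustration, not a drop-in implementation: `iter_in_pages` and `make_fake_fetch_page` are hypothetical helper names, and the fake page fetcher stands in for the datastore so the snippet runs outside App Engine. The one assumption taken from ndb is that `Query.fetch_page` returns a `(results, cursor, more)` triple and accepts a `start_cursor` option.

```python
def iter_in_pages(fetch_page, page_size=500):
    """Yield entities one page at a time instead of in one huge fetch.

    fetch_page(page_size, cursor) must return (results, next_cursor, more),
    the same shape of triple that ndb's Query.fetch_page returns.
    """
    cursor, more = None, True
    while more:
        results, cursor, more = fetch_page(page_size, cursor)
        for entity in results:
            yield entity


def make_fake_fetch_page(items):
    """Hypothetical stand-in for the datastore: the cursor is a list offset."""
    def fetch_page(page_size, cursor):
        start = cursor or 0
        end = start + page_size
        return items[start:end], end, end < len(items)
    return fetch_page


# With ndb, the callable would wrap the real query, for example:
#   lambda n, c: Object.query().fetch_page(n, start_cursor=c)
fake = make_fake_fetch_page(list(range(2000)))
entities = list(iter_in_pages(fake, page_size=500))
```

Paging does not make deserialization itself any faster, but it bounds the work done per call. In a batch job the cursor can be passed along (for instance to a follow-up task) so that no single request has to process all of the entities.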