google-cloud-datastore app-engine-ndb

Most efficient way to delete thousands of entities in Datastore


This is how I delete thousands of entities in Datastore. First, I query for a single matching entity. If one exists, I fetch the keys of up to 500 matching entities and delete them. Then I defer deletealltarget again, repeating until no matching entity remains.

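from google.appengine.ext import deferred, ndb
import model  # application module that defines the Target model

def deletealltarget(twaccount_db_key):
  # Check whether at least one matching entity is left.
  target_db = model.Target.query().filter(ndb.GenericProperty('twaccount_key') == twaccount_db_key).get()
  if target_db:
    # Fetch only the keys of the next batch (up to 500) and delete them.
    target_dbs = model.Target.query().filter(ndb.GenericProperty('twaccount_key') == twaccount_db_key).fetch(500, keys_only=True)
    ndb.delete_multi(target_dbs)
    # Re-enqueue this function to handle the remaining entities.
    deferred.defer(deletealltarget, twaccount_db_key)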

Is there any better way?


Solution

  • You could use ndb.delete_multi_async to issue the deletes asynchronously instead of chaining deferred tasks. Apart from that, your approach is fine: you are already following the usual advice, such as fetching with keys_only=True.

    Google recommends using a Dataflow template for bulk deletion, but I don't know if it fits your scenario.
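
As a rough illustration of the delete_multi_async suggestion, here is a minimal sketch, not a drop-in replacement. The helper name delete_all and its two callable parameters are hypothetical, introduced only so the batching logic can be read and exercised without the App Engine SDK. The wiring shown in the trailing comment assumes the legacy ndb calls query.fetch_page(..., keys_only=True) and ndb.delete_multi_async.

```python
def delete_all(fetch_page, delete_multi_async, batch_size=500):
    """Delete every key returned by fetch_page, one batch at a time.

    fetch_page(batch_size, cursor) must return (keys, next_cursor, more),
    the same shape as ndb's Query.fetch_page.
    delete_multi_async(keys) must return a list of futures that expose
    get_result(), as ndb.delete_multi_async does.
    """
    cursor = None
    more = True
    pending = []
    deleted = 0
    while more:
        keys, cursor, more = fetch_page(batch_size, cursor)
        if keys:
            # Start the delete without blocking, so the next page can be
            # fetched while this batch is still in flight.
            pending.extend(delete_multi_async(keys))
            deleted += len(keys)
    # Wait for every delete to finish; get_result() re-raises any failure.
    for future in pending:
        future.get_result()
    return deleted


# Hypothetical wiring with legacy App Engine ndb (not executed here):
#   query = model.Target.query(
#       ndb.GenericProperty('twaccount_key') == twaccount_db_key)
#   delete_all(
#       lambda size, cursor: query.fetch_page(
#           size, start_cursor=cursor, keys_only=True),
#       ndb.delete_multi_async)
```

Paging with a cursor removes the need for the separate existence check before each batch. The request deadline still applies, so for very large result sets it may remain sensible to run a loop like this inside a deferred task rather than in a user-facing request.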