python io google-cloud-firestore multiprocessing

How to bulk delete specific documents from Firestore in Python


I have a list of document IDs that I want to delete from Firestore. What is the most efficient way to do so? I'm currently looping over the list:

document_ids = ["xyz123", "abc0987", "tvu765", ...] # could be up to 30 IDs

for doc_id in document_ids:
    database.collection("documents").document(doc_id).delete()

This is done via an AJAX call from the frontend to a Flask route. Once the deletes finish, the route sends back a response, but when there are 20+ IDs it can take a few seconds to complete.

Is there a way to say, "here, delete these from this collection"?


Solution

  • Perhaps you could use multiprocessing to speed up the I/O-bound operations.

    Use Multiprocessing To Speed Up

    Refer: YouTube Video

    import multiprocessing
    import time
    from google.cloud import firestore

    document_ids = ["xyz123", "abc0987", "tvu765", ...] # could be up to 30 IDs

    def delete_doc(doc_id):
        # Each process creates its own client.
        database = firestore.Client()
        database.collection("documents").document(doc_id).delete()

    # The guard is needed where child processes are spawned (e.g. Windows),
    # because each child re-imports this module.
    if __name__ == "__main__":
        t_start = time.perf_counter()
        processes = []
        for doc_id in document_ids:
            # One process per document ID.
            p = multiprocessing.Process(target=delete_doc, args=(doc_id,))
            p.start()
            processes.append(p)

        # Wait for every deletion to finish before stopping the timer.
        for p in processes:
            p.join()

        t_finish = time.perf_counter()
        print(f"Total Elapsed Time: {round(t_finish - t_start, 3)} s")
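  • As an alternative to running the deletes in parallel, Firestore's batched writes may be closer to "here, delete these from this collection": the deletes are queued on a batch and sent in a single commit rather than one request per document. Below is a minimal sketch, not a definitive implementation. The helper name `delete_documents` and its `chunk_size` parameter are hypothetical, and the default of 500 is an assumption based on the commonly documented per-batch write limit, so check the current Firestore quotas. The function takes the client as an argument so it can be exercised without credentials.

```python
def delete_documents(database, collection_name, document_ids, chunk_size=500):
    """Delete the given document IDs with batched writes.

    `database` is expected to be a google.cloud.firestore.Client (or any
    object exposing the same `collection()` and `batch()` methods).
    `chunk_size` is a hypothetical parameter; 500 is an assumed limit.
    Returns the number of deletes that were committed.
    """
    collection = database.collection(collection_name)
    deleted = 0
    for start in range(0, len(document_ids), chunk_size):
        chunk = document_ids[start:start + chunk_size]
        batch = database.batch()
        for doc_id in chunk:
            # Queue the delete; nothing is sent until commit().
            batch.delete(collection.document(doc_id))
        # Send all queued deletes for this chunk in one request.
        batch.commit()
        deleted += len(chunk)
    return deleted


# Hypothetical usage (requires google-cloud-firestore and credentials):
#
#   from google.cloud import firestore
#   database = firestore.Client()
#   delete_documents(database, "documents", ["xyz123", "abc0987", "tvu765"])
```

    With up to 30 IDs this results in a single commit, so the Flask route waits on one round trip instead of 30. Deleting an ID that does not exist is not treated as an error by Firestore, so the returned count reflects deletes requested, not documents that actually existed.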