pymongo

Smart way to update all documents in MongoDB using a Python function


I run some_python_function on every document in a collection. The function returns a different value for each document.

I wrote the following loop, but it is very slow:

for doc in db.collection.find(query, projection):
    result = db.collection.update_one(
        {"_id": doc["_id"]}, 
        {"$set": {"field": some_python_function(doc["field"])}}
    )

I am looking for a smarter way to do this than updating documents one by one.

What would you recommend?

EDIT: I have just found bulk operations in the API: https://pymongo.readthedocs.io/en/stable/examples/bulk.html

from pymongo import UpdateOne

updates = []
for doc in db.collection.find(query, projection):
    if doc.get("titles"):
        updated_field = some_python_function(doc["field"])
        # Queue an update only when the value actually changes
        if doc["field"] != updated_field:
            updates.append(
                UpdateOne(
                    {"_id": doc["_id"]},
                    {"$set": {"field": updated_field}}
                )
            )
# bulk_write raises on an empty list, so send only if something is queued
if updates:
    result = db.collection.bulk_write(updates)


Solution

  • Use bulk_write (bulkWrite in the MongoDB shell) to write multiple documents at once.

    Here is an answer for a similar question.
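
For a large collection, collecting every update in one list can use a lot of memory. A minimal sketch of one way around that is to send the operations in fixed-size batches. The helper names in_batches, pending_updates and bulk_update, the default batch size of 1000, and the choice of ordered=False are assumptions made for illustration; they do not come from the question or the linked answer.

```python
from itertools import islice


def in_batches(iterable, size):
    """Yield lists of up to `size` items taken from `iterable`."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def pending_updates(cursor, transform, field):
    """Yield an UpdateOne for each document whose `field` value would change."""
    # Imported here so in_batches stays usable without PyMongo installed.
    from pymongo import UpdateOne

    for doc in cursor:
        if field not in doc:
            continue
        new_value = transform(doc[field])
        if new_value != doc[field]:
            yield UpdateOne({"_id": doc["_id"]}, {"$set": {field: new_value}})


def bulk_update(collection, query, transform, field="field", batch_size=1000):
    """Apply `transform` to `field` of matching documents, one batch at a time.

    Returns the total number of documents modified.
    """
    # Project only the field being rewritten (_id is returned by default).
    cursor = collection.find(query, {field: 1})
    modified = 0
    for batch in in_batches(pending_updates(cursor, transform, field), batch_size):
        # ordered=False: the server attempts every operation in the batch
        # instead of stopping at the first failure.
        result = collection.bulk_write(batch, ordered=False)
        modified += result.modified_count
    return modified
```

With the names from the question, a call would look like bulk_update(db.collection, query, some_python_function). Because in_batches never yields an empty list, bulk_write is never called with zero operations.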