I run some_python_function on every document in a collection.
The function returns a different value for each document.
I wrote the following loop, but it is very slow.
for doc in db.collection.find(query, projection):
    result = db.collection.update_one(
        {"_id": doc["_id"]},
        {"$set": {"field": some_python_function(doc["field"])}}
    )
I am looking for a smarter way to do this than updating the documents one by one.
What would you recommend?
EDIT: I have just found bulk operations in the API: https://pymongo.readthedocs.io/en/stable/examples/bulk.html
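from pymongo import UpdateOne

updates = []
for doc in db.collection.find(query, projection):
    # Skip documents whose "titles" value is missing or empty.
    if doc.get("titles"):
        updated_field = some_python_function(doc["field"])
        # Queue an update only when the value actually changes.
        if doc["field"] != updated_field:
            updates.append(
                UpdateOne(
                    {"_id": doc["_id"]},
                    {"$set": {"field": updated_field}}
                )
            )

# bulk_write raises InvalidOperation on an empty list, so guard the call.
if updates:
    result = db.collection.bulk_write(updates)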
Use bulkWrite (bulk_write in PyMongo) to write multiple documents at once.
Here is an answer to a similar question.
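
For a large collection, one possible refinement is to send the operations in fixed-size batches, so the full list of updates never has to sit in memory at once. The sketch below is only an illustration: bulk_write_in_batches and batch_size are hypothetical names, not part of PyMongo, and it assumes the collection object exposes PyMongo's bulk_write(requests, ordered=...) method.

```python
from itertools import islice


def bulk_write_in_batches(collection, operations, batch_size=1000):
    """Send operations to collection.bulk_write in fixed-size batches.

    `operations` may be any iterable (including a generator) of PyMongo
    write operations such as UpdateOne. Returns the number of batches sent.
    Hypothetical helper: the name and the default batch size are assumptions.
    """
    iterator = iter(operations)
    batches_sent = 0
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            break  # bulk_write rejects an empty list, so stop here
        # ordered=False lets the server attempt every operation in the
        # batch instead of stopping at the first failure.
        collection.bulk_write(batch, ordered=False)
        batches_sent += 1
    return batches_sent
```

To use it, pass db.collection together with a generator that yields one UpdateOne per document returned by db.collection.find(query, projection). Whether a batch size of 1000 is a good choice depends on document size and on the server, so treat it as a starting point to measure, not a recommendation.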