mongodb foreach bulkupdate

MongoDB bulk update efficiency using forEach


How would you approach bulk / batch updating documents (up to 10k docs) while iterating over them with forEach? (There is no specific criterion to update by; the documents are selected at random.)

I'm looking at two options:

  1. Collect every document's _id into an array inside the forEach closure, then update them all afterwards using collection.update({_id : {$in : idsArray}}, ...)
  2. Add one update query per document to a bulk operation inside the forEach closure and execute it once done, somewhere along the lines of bulk.find({_id: doc.id}).updateOne({...}); bulk.execute();
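
As a rough illustration, the two options might look like the sketch below with the Node.js driver. The function names, the `docs` array (assumed to be already loaded in memory) and the `update` argument (for example `{ $set: { flagged: true } }`) are placeholders that do not come from the question; `updateMany` is swapped in for `collection.update(...)` so that every matched document is updated.

```javascript
// Option 1: collect the _id values first, then send a single update command.
// `collection` is assumed to be a Node.js driver Collection object.
async function updateByIdList(collection, docs, update) {
  const ids = [];
  docs.forEach((doc) => ids.push(doc._id));
  if (ids.length === 0) return null; // nothing to update

  // One command, one filter: every document whose _id is in the array.
  return collection.updateMany({ _id: { $in: ids } }, update);
}

// Option 2: queue one updateOne per document, then execute the whole batch.
async function updateWithBulk(collection, docs, update) {
  const bulk = collection.initializeUnorderedBulkOp();
  let queued = 0;
  docs.forEach((doc) => {
    bulk.find({ _id: doc._id }).updateOne(update);
    queued += 1;
  });
  if (queued === 0) return null; // executing an empty batch raises an error

  return bulk.execute();
}
```

Both functions return whatever result object the driver hands back, so the caller can inspect the counts it reports.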

I'm going to benchmark this soon, but I would like to know which option is more I/O efficient and considered the 'smart' approach with MongoDB.


Solution

  • OK, so I've benchmarked the two options.

    TL;DR: option one was about twice as fast, so collect the ids and update once.

    For future reference, some more details:

    • Total number of documents in db is around 500k.
    • Documents contain around 20-25 fields each.
    • Did an update on 10-30k documents.

    Results (times are machine specific, but the relative difference is what matters):

    1. One update with ids array: 200-500ms.
    2. Bulk update: 600-1000ms.

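    Looking back, I thought bulk might be faster because of some hidden optimization. But I now see that the question overlooked the simple logic: fewer operations probably means faster. The single update sends one command with an $in filter, whereas the bulk operation queues a separate update for every document, hence bulk is slower.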
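
Timings like the ones above could be gathered with a small helper along these lines. This is only a sketch and not the harness actually used for the benchmark: `timeRuns` and `summarize` are hypothetical names, and `fn` stands for any async function that performs one of the two update strategies.

```javascript
// Hypothetical helper: runs an async function `runs` times and returns the
// elapsed wall-clock time of each run in milliseconds.
async function timeRuns(fn, runs = 5) {
  const timings = [];
  for (let i = 0; i < runs; i += 1) {
    const start = process.hrtime.bigint();
    await fn();
    const end = process.hrtime.bigint();
    timings.push(Number(end - start) / 1e6); // nanoseconds -> milliseconds
  }
  return timings;
}

// Reduce the timings to a min-max range, similar to the ranges quoted above.
function summarize(timings) {
  return { min: Math.min(...timings), max: Math.max(...timings) };
}
```

Repeating each strategy several times and comparing the ranges, rather than single runs, helps smooth out caching and other machine-specific noise.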