mongodb, mongodb-query, pymongo

Bulk update using pymongo, merging python dictionary and DB collection


As an example, suppose we have the following list of dictionaries in memory:

dict_list = [
  {
    'id': 1,
    'from_value_key': 'x'
  },
  {
    'id': 2,
    'from_value_key': 'y'
  },
  {
    'id': 4,
    'from_value_key': 'z'
  }
]

The db.collection collection currently contains:

id: 1, other_key_and_values...
id: 2, other_key_and_values...
id: 3, other_key_and_values...

I want to batch-update the collection above from the dictionary list.

After the update, the collection should contain:

id: 1, other_key_and_values..., to_value_key: 'x'
id: 2, other_key_and_values..., to_value_key: 'y'
id: 3, other_key_and_values... # id: 3 is not in the dictionary list, so to_value_key is not set.
# id: 4 is ignored because it exists only in the dictionary list.

I think I could create multiple UpdateOne operations and pass them to bulk_write, but is it possible to perform the update in a single batch with an aggregation pipeline, using stages such as $lookup, $unwind, or $merge?
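For comparison, the UpdateOne/bulk_write approach mentioned above might look roughly like the sketch below. The helper names build_update_specs and apply_bulk_update are hypothetical, and collection is assumed to be a pymongo Collection. Because upsert is left at its default of False, id: 4 matches nothing and is ignored.

```python
def build_update_specs(dict_list):
    """Return one (filter, update) pair per dictionary in dict_list."""
    return [
        ({"id": d["id"]}, {"$set": {"to_value_key": d["from_value_key"]}})
        for d in dict_list
    ]


def apply_bulk_update(collection, dict_list):
    """Send all updates to the server in a single bulk_write call.

    `collection` is assumed to be a pymongo Collection.
    """
    # Imported here so build_update_specs can be used without pymongo installed.
    from pymongo import UpdateOne

    # upsert defaults to False, so ids missing from the collection are skipped.
    ops = [UpdateOne(flt, upd) for flt, upd in build_update_specs(dict_list)]
    if not ops:
        return None  # bulk_write raises InvalidOperation on an empty list
    # ordered=False lets the server continue past an individual failure.
    return collection.bulk_write(ops, ordered=False)
```

With a live connection, apply_bulk_update(db.collection, dict_list) would return a BulkWriteResult whose matched_count shows how many ids were found.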


Solution

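  • Assuming the id field is unique and has a unique index, you can put your documents in a $documents stage and $merge them into the collection. Because $documents must be the first stage of a database-level aggregation, the pipeline runs on db rather than on the collection. Setting whenNotMatched to "discard" keeps id: 4 from being inserted ($merge inserts unmatched documents by default), and the $project stage renames from_value_key to to_value_key.

    db.aggregate([
      {
        "$documents": [
          // your python dict here
          {
            "id": 1,
            "from_value_key": "x"
          },
          {
            "id": 2,
            "from_value_key": "y"
          },
          {
            "id": 4,
            "from_value_key": "z"
          }
        ]
      },
      {
        // rename from_value_key to to_value_key before merging
        "$project": {
          "id": 1,
          "to_value_key": "$from_value_key"
        }
      },
      {
        "$merge": {
          "into": "collection",
          "on": "id",
          "whenMatched": "merge",
          // ignore ids that exist only in the dictionary list
          "whenNotMatched": "discard"
        }
      }
    ])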
    

    Note: Mongo Playground does not yet support $documents, so this pipeline cannot be run there. It is shown only to demonstrate the syntax.
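    From pymongo, the same idea could be sketched as follows. The helper names build_merge_pipeline and run_merge are hypothetical, and db is assumed to be a pymongo Database. This variant renames from_value_key to to_value_key in Python before sending the documents, and sets whenNotMatched to "discard" so that id: 4 is not inserted, matching the expected result in the question.

```python
def build_merge_pipeline(dict_list, target="collection"):
    """Build a $documents + $merge pipeline from a list of dictionaries."""
    # Rename the key in Python so the merged field is called to_value_key.
    docs = [
        {"id": d["id"], "to_value_key": d["from_value_key"]}
        for d in dict_list
    ]
    return [
        {"$documents": docs},
        {
            "$merge": {
                "into": target,
                "on": "id",  # requires a unique index on id
                "whenMatched": "merge",
                "whenNotMatched": "discard",  # skip ids not in the collection
            }
        },
    ]


def run_merge(db, dict_list, target="collection"):
    """Run the pipeline at database level, since $documents has no source collection.

    `db` is assumed to be a pymongo Database.
    """
    if not dict_list:
        return  # nothing to merge
    db.aggregate(build_merge_pipeline(dict_list, target))


# Usage (assumes pymongo is installed and a MongoDB server is reachable;
# the URI and database name below are placeholders):
#   from pymongo import MongoClient
#   db = MongoClient("mongodb://localhost:27017")["mydb"]
#   run_merge(db, dict_list)
```

    Building the pipeline in a separate function keeps it easy to inspect or print before anything is sent to the server.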