Search code examples
mongodbaggregation-frameworkevent-log

MongoDB: conditional updates to array fields in a single document


Context: I have created a service that deals with users data. I can not modify source and order of data in upstream system (usually it is a partial snapshot of a user or a delta (e.g. user's new post). I also must deal with duplicated information. I modeled my persistent layer around document db (MongoDB) using a single 'users' collection because I deal with independent users that do not have any relation, I don't need to update multiple tables and my read model is similar to my document's schema. I can always identify a user by his/her _id and login so whenever I need to update user's info I can update just a single document. Moreover usually I don't want to override user's data, but rather append it so a document creates a user's log. I just update it in the very rare case (when the data is nullable but will be updated only once (details) or I don't care about intermediate state (details.verified).

Problem: Whenever I receive an info about a given user, after matching it I want to check if the user should be modified and update it in a single atomic query/update. The problem is I may have several fields or array entries to check.

In a given example I want to:

  1. match a user with _id 123
  2. add location entry to locations array only if in the last one location differs than the the one in the query.
  3. A profile is very complex, thus I introduced a hash. I want to add a new profile entry only if the last hash differs than the one provided in the query.
  4. Add a new post to posts array only if it does not contain a post with given postId.

Another problem is that I can have e.g. just an information about a new post (that can also be a duplicate) for a given user, but it doesn't exist yet thus I need to upsert it.

So I want to cover few use cases that will update user's data in a single query, e.g.:

  • a user not exist and we add it's details
  • we append a new post that is not yet persisted
  • we are trying to append a post that was already saved
  • we append a new post and check if location hasn't changed (and update it if needed)
  • we check if latest user's profile has changed by comparing hash and update it if needed and at the same time add info to user's details.

Updates based on a single criteria seems to be strait forward. Without aggregation I can update a document with only one match criteria. Aggregation pipeline can contain several stages, but I am not certain I can use push stage to append to an array if certain criteria is met without a group stage. Upsert with conditional push is also problematic. I end up by ignoring DuplicateKeyException (so far without aggregation, because my service's logic relay on information if we deal with a duplicate). Maybe you know a better solution.

https://www.mongodb.com/docs/manual/reference/operator/aggregation/push/

A question if this approach is doable? I want to avoid replacing a document. I would be appreciate for some guideline and/or example queries.

A simplified document:

[
  {
    "_id": 123,
    "login": "john",
    "details": {
      "a": "abc",
      "verified": true
    },
    "locations": [
      {
        "location": {
          "country": "UK",
          "region": "foo",
          "city": "bar"
        },
        "createdAt": ISODate("2020-01-22T21:21:41.052Z")
      }
    ],
    "profiles": [
      {
        "preference": {
          "a": "foo",
          "b": "bar"
        },
        "details": {
          "a": "foo",
          "b": "bar"
        },
        "hash": "hash123",
        "createdAt": ISODate("2020-01-22T21:21:41.052Z")
      }
    ],
    "posts": [
      {
        "postId": 234,
        "message": "foo bar",
        "createdAt": ISODate("2020-01-22T21:21:41.052Z")
      },
      {
        "postId": 345,
        "message": "bar foo",
        "createdAt": ISODate("2020-01-22T21:21:41.052Z")
      }
    ]
  }
]

https://mongoplayground.net/p/UsS72apI23F


Solution

  • Your 3 updates of locations, profiles, and posts shared the same update pattern and can be solved with the below update:

    1. assign a variable of your input with $let
    2. perform conditional checking with $cond; update or leave the current array as-is based on the result. If update is needed, $concatArrays with current array and a single-sized array which holds your input variable.

    It may seem abstract so here is a concrete example of locations

    {
        "$set": {
          "locations": {
            "$let": {
              "vars": {
                // your location input here
                "locationInput": {
                  "country": "TEST",
                  "region": "test",
                  "city": "test"
                }
              },
              "in": {
                "$cond": {
                  "if": {
                    $ne: [
                      "$$locationInput",
                      // last location
                      {
                        $last: "$locations.location"
                      }
                    ]
                  },
                  "then": {
                    "$concatArrays": [
                      "$locations",
                      [
                        "$$locationInput"
                      ]
                    ]
                  },
                  "else": "$locations"
                }
              }
            }
          }
        }
      }
    

    Then just repeat the $set 3 times for 3 cases. In the playground, they are separated for clarity/readability. You can put them into a single $set if preferred.

    Mongo Playground


    EDIT: to handle the case in which the document does not exist and you want to insert it. You can $unionWith the "template" document you want to ensure at least one document for that _id exists. Then use $first to pick the first document. If the document already exists, it would be the first document and it would be picked. If it does not exist, the template document will be picked. Then you can continue with the update above. Finally, use $merge to update to the collection.

    {
        "$match": {
          "_id": 123
        }
      },
      {
        "$unionWith": {
          "coll": "collection",
          "pipeline": [
            {
              $documents: [
                // put the "template" you want to insert here
                {
                  "_id": 123,
                  "login": "john",
                  "details": {
                    "a": "abc",
                    "verified": true
                  },
                  "locations": [],
                  "profiles": [],
                  "posts": []
                }
              ]
            }
          ]
        }
      },
      {
        $group: {
          _id: "$_id",
          // if the user exists, it will be the first document picked
          doc: {
            $first: "$$ROOT"
          }
        }
      },
      {
        "$replaceRoot": {
          "newRoot": "$doc"
        }
      },
      ... // remaining updates above
      {
        "$merge": {
          "into": "collection",
          "on": "_id"
        }
      }
    

    Mongo Playground when no document exists