Search code examples
mongodbpipelinebson

Link each element of array in a document to the corresponding element in an array of another document with MongoDB


Using MongoDB 4.2 and MongoDB Atlas to test aggregation pipelines.

I've got this products collection, containing documents with this schema:

 {
    "name": "TestProduct",
    "relatedList": [
      {id:ObjectId("someId")},
      {id:ObjectId("anotherId")}
    ]
 }

Then there's this cities collection, containing documents with this schema :

{
        "name": "TestCity",
        "instructionList": [
          { related_id: ObjectId("anotherId"), foo: bar},
          { related_id: ObjectId("someId"), foo: bar}
          { related_id: ObjectId("notUsefulId"), foo: bar}
          ...
        ]
 }

My objective is to join both collections to output something like this (the operation is picking each related object from the instructionList in the city document to put it into the relatedList of the product document) :

{
        "name": "TestProduct",
        "relatedList": [
          { related_id: ObjectId("someId"), foo: bar},
          { related_id: ObjectId("anotherId"), foo: bar},
        ]
}

I tried using the $lookup operator for aggregation like this :

$lookup:{
  from: 'cities',
  let: {rId:'$relatedList._id'},
  pipeline: [
         {
           $match: {
             $expr: {
               $eq: ["$instructionList.related_id", "$$rId"]
             }
           }
         },
  ]
}

But it's not working, I'm a bit lost with this complex pipeline syntax.

Edit

By using unwind on both arrays :

    { 
         {$unwind: "$relatedList"},
         {$lookup:{
             from: "cities",
             let: { "rId": "$relatedList.id" },
             pipeline: [
        
                {$unwind:"$instructionList"},
                {$match:{$expr:{$eq:["$instructionList.related_id","$$rId"]}}},

             ],
             as:"instructionList",
         }},

         {$group: {
             _id: "$_id",
             instructionList: {$addToSet:"$instructionList"}

          }}
}

I am able to achieve what I want, however, I'm not getting a clean result at all :

{
 "name": "TestProduct",
 instructionList: [
    [
      {
        "name": "TestCity",
        "instructionList": {
         "related_id":ObjectId("someId")
        }
      }
    ],
    [
      {
        "name": "TestCity",
        "instructionList": {
         "related_id":ObjectId("anotherId")
        }
      }
    ]
 ]
}

How can I group everything to be as clean as stated for my original question ? Again, I'm completely lost with the Aggregation framework.


Solution

  • the operation is picking each related object from the instructionList in the city document to put it into the relatedList of the product document)

    Given an example document on cities collection:

    {"_id": ObjectId("5e4a22a08c54c8e2380b853b"),
      "name": "TestCity",
      "instructionList": [
        {"related_id": "a", "foo": "x"},
        {"related_id": "b", "foo": "y"},
        {"related_id": "c", "foo": "z"}
    ]}
    

    and an example document on products collection:

    {"_id": ObjectId("5e45cdd8e8d44a31a432a981"),
      "name": "TestProduct",
      "relatedList": [
        {"id": "a"},
        {"id": "b"}
    ]}
    

    You can achieve try using the following aggregation pipeline:

    db.products.aggregate([
        {"$lookup":{
            "from": "cities", 
            "let": { "rId": "$relatedList.id" }, 
            "pipeline": [
                {"$unwind":"$instructionList"},
                {"$match":{
                    "$expr":{
                        "$in":["$instructionList.related_id", "$$rId"]
                    }
                }
            }], 
            "as":"relatedList",
        }}, 
        {"$project":{
            "name":"$name",
            "relatedList":{
                "$map":{
                    "input":"$relatedList",
                    "as":"x",
                    "in":{
                        "related_id":"$$x.instructionList.related_id",
                        "foo":"$$x.instructionList.foo"
                    }                
                }
            }
        }}
    ]);
    

    To get a result as the following:

    {  "_id": ObjectId("5e45cdd8e8d44a31a432a981"),
       "name": "TestProduct",
       "relatedList": [
              {"related_id": "a", "foo": "x"},
              {"related_id": "b", "foo": "y"}
    ]}
    

    The above is tested in MongoDB v4.2.x.

    But it's not working, I'm a bit lost with this complex pipeline syntax.

    The reason why it's slightly complex here is because you have an array relatedList and also an array of subdocuments instructionList. When you refer to instructionList.related_id (which could mean multiple values) with $eq operator, the pipeline doesn't know which one to match.

    In the pipeline above, I've added $unwind stage to turn instructionList into multiple single documents. Afterward, using $in to express a match of single value of instructionList.related_id in array relatedList.