Search code examples
mongodbaggregation-frameworkaggregate

Match/filter/project nested fields from array in array in object


I have unsuccessfully attempted to extract some fields from nested array.

Here is the example document:

[
{
_id: 1,
"_a": [
  {
    "_p": [
      {
        _pubId: {
          "CC": "EN"
        },
        _prid: "2",
        s: {
          c: {
            text: [
              {
                secId: "4"
              },
              {
                secId: "5"
              }
            ]
          },
          d: {
            text: [
              {
                secId: "6"
              },
              {
                secId: "7"
              }
            ]
          }
        }
      },
      {
        _pubId: {
          "CC": "CZ"
        },
        _prid: "4",
        s: {
          c: {
            text: [
              {
                secId: "8"
              },
              {
                secId: "9"
              }
            ]
          }
        }
      }
    ]
  },
  {
    "_p": [
      {
        _pubId: {
          "CC": "CZ"
        },
        _prid: "200",
        s: {
          c: {
            text: [
              {
                secId: "4"
              },
              {
                secId: "5"
              }
            ]
          },
          d: {
            text: [
              {
                secId: "6"
              },
              {
                secId: "7"
              }
            ]
          }
        }
      },
      {
        _pubId: {
          "CC": "BG"
        },
        _prid: "4",
        s: {
          c: {
            text: [
              {
                secId: "8"
              },
              {
                secId: "9"
              }
            ]
          }
        }
      }
    ]
  }
 ]
}
]

And I need to extract all the "_a._p.s.[d|c].text.secId" only for "_a._p._pubId.CC":"EN" , the output need to look as follow:

_prid , secId
{ _prid:2 , secId:4 }
{ _prid:2 , secId:5 }
{ _prid:2 , secId:6 }
{ _prid:2 , secId:7 }

Example playground


Solution

  • See Below for updated aggregation pipeline based on comments from the OP.

    This is pretty ugly, and needs to be tested on a larger collection, but this seems to output what you want.

    db.collection.aggregate([
      {"$unwind": "$_a"},
      {"$unwind": "$_a._p"},
      {
        "$match": {
          "_a._p._pubId.CC": "EN"
        }
      },
      {
        "$group": {
          "_id": "$_a._p._prid",
          "pridOut": {
            "$push": {
              "$reduce": {
                "input": {"$objectToArray": "$_a._p.s"},
                "initialValue": [],
                "in": {
                  "$concatArrays": [
                    "$$value",
                    {
                      "$reduce": {
                        "input": "$$this.v.text",
                        "initialValue": [],
                        "in": {"$concatArrays": ["$$value", ["$$this"]]}
                      }
                    }
                  ]
                }
              }
            }
          }
        }
      },
      {
        "$set": {
          "pridOut": {
            "$reduce": {
              "input": "$pridOut",
              "initialValue": [],
              "in": {"$setUnion": ["$$value", "$$this"]}
            }
          }
        }
      },
      {
        "$set": {
          "pridOut": {
            "$map": {
              "input": "$pridOut",
              "as": "eachSecId",
              "in": {"$mergeObjects": [{"_prid": "$_id"}, "$$eachSecId"]}
            }
          }
        }
      },
      {"$unwind": "$pridOut"},
      {"$replaceWith": "$pridOut"}
    ])
    

    Output using your sample collection:

    [
      {"_prid": "2", "secId": "4"},
      {"_prid": "2", "secId": "5"},
      {"_prid": "2", "secId": "6"},
      {"_prid": "2", "secId": "7"}
    ]
    

    Try it on mongoplayground.net.

    This is Below :-)

    Avoiding "$unwind" (until the end), here's another aggregation pipeline to produce the desired output.

    db.collection.aggregate([
      {
        "$match": {
          "_a._p._pubId.CC": "EN"
        }
      },
      {
        "$project": {
          "_id": 0,
          "output": {
            "$reduce": {
              "input": {
                "$reduce": {
                  "input": {
                    "$reduce": {
                      "input": "$_a._p",
                      "initialValue": [],
                      "in": {"$concatArrays": ["$$value", "$$this"]}
                    }
                  },
                  "initialValue": [],
                  "in": {
                    "$concatArrays": [
                      "$$value",
                      {
                        "$cond": [
                          {"$eq": ["$$this._pubId.CC", "EN"]},
                          {
                            "$map": {
                              "input": {"$objectToArray": "$$this.s"},
                              "as": "s",
                              "in": {
                                "$map": {
                                  "input": "$$s.v.text",
                                  "as": "text",
                                  "in": {
                                    "$mergeObjects": [{"_prid": "$$this._prid"}, "$$text"]
                                  }
                                }
                              }
                            }
                          },
                          []
                        ]
                      }
                    ]
                  }
                }
              },
              "initialValue": [],
              "in": {"$concatArrays": ["$$value", "$$this"]}
            }
          }
        }
      },
      {"$unwind": "$output"},
      {"$replaceWith": "$output"}
    ])
    

    Try this on mongoplayground.net.