Search code examples
mongodbmongodb-query

Extracting unique objects using mongoldb


I attempting to build a query for MongoDB that returns all unique values from an array within a document, by a key value (batch_id). I have been successful in doing this, grouping the unique batch_ids, however I'm unable to extract the corresponding key value (batchName) and count the number of objects that had the unique values per array.

How do I extract batchName and count the number objects? I've tried a few things, however I think i'm having trouble with $addToSet.

MongoPlayground: https://mongoplayground.net/p/j6Jkeh79Zqq

My current output is:

[
  {
    "_id": "6032a5ad80443334a35f2232",
    "batchData": [
      {
        "batch_id": "6032a5af80443334a35f2267TB"
      },
      {
        "batch_id": "6032a5af80443334a35f26467JC"
      },
      {
        "batch_id": "6032a5af80443334a5f74367UI"
      }
    ]
  }
]

My Expected Output is

    [
      {
        "_id": "6032a5ad80443334a35f2232",
        "batchData": [
          {
            "batch_id": "6032a5af80443334a35f2267TB",
            "batchName": "Test 2",
            "count": 2
          },
          {
            "batch_id": "6032a5af80443334a35f26467JC",
            "batchName": "Test 3",
            "count": 1
          },
          {
            "batch_id": "6032a5af80443334a5f74367UI",
            "batchName": "Test 1",
            "count": 2
          }
        ]
      }
    ]

Example Document:

[
  {
    "_id": "6032a5ad80443334a35f2232",
    "List": [
      {
        "_id": "6032a5af80443334a35f2234",
        "batch_id": "6032a5af80443334a5f74367UI",
        "batchName": "Test 1",
        
      },
      {
        "_id": "6032a5b080443334a35f2236",
        "batch_id": "6032a5af80443334a5f74367UI",
        "batchName": "Test 1",
        
      },
      {
        "_id": "6032a5b080443334a35f2238",
        "batch_id": "6032a5af80443334a35f2267TB",
        "batchName": "Test 2",
        
      },
      {
        "_id": "6032a5b080443334a35f2239",
        "batch_id": "6032a5af80443334a35f2267TB",
        "batchName": "Test 2",
        
      },
      {
        "_id": "6032a5b080443334a35f2234",
        "batch_id": "6032a5af80443334a35f26467JC",
        "batchName": "Test 3",
        
      }
    ]
  }
]

Mongo Query:

db.collection.aggregate([
  {
    $project: {
      batchData: {
        batch_id: "$List.batch_id",
        
      },
      
    },
    
  },
  {
    $unwind: "$batchData.batch_id"
  },
  {
    $group: {
      _id: "$_id",
      batchData: {
        $addToSet: "$batchData"
      },
      
    },
    
  },
  
])

Solution

  • You can first $unwind to flatten the array for easier processing. After that, you can $group the data again and use $first to get the batchName. Finally, you can do another $group to format to your expected format.

    db.collection.aggregate([
      {
        "$unwind": "$List"
      },
      {
        $group: {
          _id: {
            _id: "$_id",
            batch_id: "$List.batch_id"
          },
          batchName: {
            $first: "$List.batchName"
          },
          count: {
            $sum: 1
          }
        }
      },
      {
        $group: {
          _id: "$_id._id",
          batchData: {
            $push: {
              "batch_id": "$_id.batch_id",
              "batchName": "$batchName",
              "count": "$count"
            }
          }
        }
      }
    ])
    

    Here is the Mongo playground for your reference.