Search code examples
mongodbmongodb-queryaggregation-framework

find first of documents for each distinct values for one field


With a collection of documents with fields field1, field2, field3 and so on, I need to find

  1. distinct values for field3
  2. for each distinct value of field3, need to get the first document with each distinct value in field3

For # 1, I could do db.myCollection.distinct("field3")

How do I go about #2 ?

Sample Collection:

[
    { "field1": 11, "field2": "toyota", "field3": "camry" },
    { "field1": 22, "field2": "toyota", "field3": "corolla" },
    { "field1": 33, "field2": "toyota", "field3": "camry" },
    { "field1": 44, "field2": "honda", "field3": "accord" },
    { "field1": 55, "field2": "honda", "field3": "accord" },
    { "field1": 66, "field2": "honda", "field3": "city" }
]

Desired result:

[
    { "field1": 11, "field2": "toyota", "field3": "camry" },
    { "field1": 22, "field2": "toyota", "field3": "corolla" },
    { "field1": 44, "field2": "honda", "field3": "accord" },
    { "field1": 66, "field2": "honda", "field3": "city" }
]

Solution

  • You need to run an aggregate operation that groups all the documents by field3 and use the $first accumulator with the $$ROOT system variable to bring the first document, something like the following:

    db.myCollection.aggregate([
        {
            "$group": {
                "_id": "$field3",
                "doc": { "$first": "$$ROOT" }
            }
        }
    ])
    

    or for an exact output:

    db.myCollection.aggregate([
        {
            "$group": {
                "_id": "$field3",
                "field1": { "$first": "$field1" },
                "field2": { "$first": "$field2" }
            }
        },
        {
            "$project": {
                "_id": 0,
                "field3": "$_id",
                "field2": 1,
                "field1": 1
            }
        }
    ])