mongodb mongodb-query aggregation-framework

find first of documents for each distinct values for one field

With a collection of documents with fields field1, field2, field3 and so on, I need to find

distinct values for field3
for each distinct value of field3, need to get the first document with each distinct value in field3

For # 1, I could do db.myCollection.distinct("field3")

How do I go about #2 ?

Sample Collection:

[
    { "field1": 11, "field2": "toyota", "field3": "camry" },
    { "field1": 22, "field2": "toyota", "field3": "corolla" },
    { "field1": 33, "field2": "toyota", "field3": "camry" },
    { "field1": 44, "field2": "honda", "field3": "accord" },
    { "field1": 55, "field2": "honda", "field3": "accord" },
    { "field1": 66, "field2": "honda", "field3": "city" }
]

Desired result:

[
    { "field1": 11, "field2": "toyota", "field3": "camry" },
    { "field1": 22, "field2": "toyota", "field3": "corolla" },
    { "field1": 44, "field2": "honda", "field3": "accord" },
    { "field1": 66, "field2": "honda", "field3": "city" }
]

Solution

You need to run an aggregate operation that groups all the documents by field3 and use the $first accumulator with the $$ROOT system variable to bring the first document, something like the following:

db.myCollection.aggregate([
    {
        "$group": {
            "_id": "$field3",
            "doc": { "$first": "$$ROOT" }
        }
    }
])

or for an exact output:

db.myCollection.aggregate([
    {
        "$group": {
            "_id": "$field3",
            "field1": { "$first": "$field1" },
            "field2": { "$first": "$field2" }
        }
    },
    {
        "$project": {
            "_id": 0,
            "field3": "$_id",
            "field2": 1,
            "field1": 1
        }
    }
])