Search code examples
mongodbmongodb-queryprojectionaggregation-framework

Remove all but one field from a nested document via projection


Here's my sample document:

{
  a: 42,
  map: {
    x: { ... },
    y: { ... },
    z: { ... }
  }
}

I'm looking for a way to return only:

{
  a: 42,
  map: {
    y: { ... }
  }
}

I only want to specify the field inside the map to keep. Something like that...

db.myCollection.find({},
  {
    "map.y":1
  }
)

... BUT it should also return field a without explicitly specifying it.

I know I could also remove the unwanted map entries:

db.myCollection.find({},
  {
    "map.x":0,
    "map.z":0
  }
)

But this way, I need to know before I run the query which map keys are available.

Is there a nice way to handle this? Maybe using aggregation framework?

Thank you :)


Solution

  • This is basically how projection or field selection works in MongoDB as well as many other database systems. The concept here is basically "all or nothing", in that if you do not specify all of the fields you want to return they are not returned, with of course the default being to return everything where you don't say specifically what you want.

    So the correct form on your current data to just return a and map.y is :

    db.myCollection.find({}, { "a": 1, "map.y": 1 })
    

    Or of course you tell it what you don't want:

    db.myCollection.find({}, { "map.x": 0, "map.z": 0 })
    

    But you cannot "mix" inclusion and exclusion, with the only exception being the _id field where you don't want that in your results, possibly thinking "covered index" query, but usually you want the primary key:

    db.myCollection.find({}, { "_id": 0, "a": 1, "map.y": 1 })
    

    As for the aggregation framework, it is just the same and is particularly and equally as brutal with operators such as $project and $group. In both of those cases you need to specify exactly what you want to return ( $project follows the general projection rules as well ) or those fields are not present in results.

    In fact, a common mistake made my people is "removing" fields with one of these stages and then trying to refer to the field that was removed later. It is a "pipeline", just like Unix pipe | and the only content that flows to the next stage is what you specify:

    The only "real" case where you can just exclude all but a matching field without specifying others is by changing your structure to implement "map" as an array and then using the $redact pipeline stage, available from MongoDB 2.6 and upwards. Albeit a little contrived:

    db.test.insert({
        "a": 42,
        "map": [
            { "type": "x", "content": {} },
            { "type": "y", "content": {} },
            { "type": "z", "content": {} }
        ]
    })
    

    And the aggregation operation on data structured like that:

    db.test.aggregate([
        { "$redact": {
            "$cond": [
                { "$eq": [
                    { "$ifNull": [ "$type", "y" ] },
                    "y"
                ]},
                "$$DESCEND",
                "$$PRUNE"
            ]
        }}
    ])
    

    So there we only asked for matching the element with "type" equal to "y", but you generally need to be careful here as $redact is processing recursively, which is why the $ifNull operator artificially creates a match at levels where the tested field does not exist.

    But generally, projection is all or nothing. Specify the fields you want to return or they will not be there.