Search code examples
mongodbpipelineaggregation-frameworktop-nsubdocument

Mongo query against subdocument with a top N


I've got the following data in a user collection in MongoDB:

{
        "_id" : ObjectId("53807beee4b0d2b25747df7e"),
        "allowedAppIds" : [ 
            {
                "appId" : "534c8d2ce4b024b1f08e16e6",
                "createdDate" : ISODate("2014-06-21T20:22:11.945Z"),
            }, 
            {
                "appId" : "53585d97e4b0769f79d1e140",
                "createdDate" : ISODate("2014-06-21T21:03:56.045Z"),
            }, 
            {
                "appId" : "53490f2ae4b024b1f08e0c39",
                "createdDate" : ISODate("2014-06-26T22:36:21.855Z"),
            }, 
            {
                "appId" : "537e3ad7e4b0a8e6dd5cf20b",
                "createdDate" : ISODate("2014-06-28T19:44:43.805Z"),
            }
        ],
        "createdDate" : ISODate("2014-05-24T11:01:02.835Z"),
    }, 
    {
        "_id" : ObjectId("53aca073e4b00bff4ee85f6d"),
        "allowedAppIds" : [ 
            {
                "appId" : "536927eee4b005b056353a8a",
                "createdDate" : ISODate("2014-06-26T22:36:57.681Z"),
            }
        ],
        "createdDate" : ISODate("2014-06-26T22:36:35.767Z"),
    }, 
    {
        "_id" : ObjectId("53aca12fe4b00bff4ee85fd8"),
        "allowedAppIds" : [ 
            {
                "appId" : "537e3ad7e4b0a8e6dd5cf20b",
                "createdDate" : ISODate("2014-06-27T06:33:32.728Z"),
            }
        ],
        "createdDate" : ISODate("2014-06-26T22:39:43.726Z"),
    }, 
    {
        "_id" : ObjectId("53ac9f86e4b078bf3022fe16"),
        "allowedAppIds" : [ 
            {
                "appId" : "537e3ad7e4b0a8e6dd5cf20b",
                "createdDate" : ISODate("2014-06-26T22:32:49.655Z"),
            }, 
            {
                "appId" : "534c8d2ce4b024b1f08e16e6",
                "createdDate" : ISODate("2014-06-26T22:42:38.011Z"),
            }
        ],
        "createdDate" : ISODate("2014-06-26T22:32:38.806Z"),
    } 

I'm trying to get a count of the first appIds for the entire collection. So for each user, I need to get their very first allowedAppId, the group by that appId and get a count for the entire collection.

For the data above, I'm trying to get this:

appId:  534c8d2ce4b024b1f08e16e6 count: 1
appId:  536927eee4b005b056353a8a count: 1
appId:  537e3ad7e4b0a8e6dd5cf20b count: 2

Once I get that working, I'm going to try putting date parameters around the created date in the allowedAppIds.createdDate to further filter by dates.

I'm still fairly new to Mongo, so if this is basic I apologize. I've seen lots of examples on google, etc, but haven't been able to find anything that will use the aggregation pipeline to get me a top 1 record from a subdocument.

I did get far enough to get a filtering of records that actually have allowedAppIds (NOT all user documents have an allowedAppId subdocument) by doing this in my pipeline:

db.user.aggregate({$match:{"allowedAppIds.0": {$exists: true}}})

but I'm still not able to just get the first appId with it's associated createdDate back out from there.


Solution

  • You can do this by unwinding the allowedAppIds field and then grouping twice:

    db.test.aggregate([
        // Duplicate the docs, one per allowedAppIds element
        {$unwind: '$allowedAppIds'},
        // Group them back by _id, but just take the first element from each _id set
        {$group: {_id: '$_id', appId: {$first: '$allowedAppIds.appId'}}},
        // Now group by appId and count
        {$group: {_id: '$appId', count: {$sum: 1}}}
    ])
    

    Output:

    {
        "result" : [ 
            {
                "_id" : "534c8d2ce4b024b1f08e16e6",
                "count" : 1
            }, 
            {
                "_id" : "536927eee4b005b056353a8a",
                "count" : 1
            }, 
            {
                "_id" : "537e3ad7e4b0a8e6dd5cf20b",
                "count" : 2
            }
        ],
        "ok" : 1
    }