Search code examples
mongodbduplicatessubdocument

how to retrive all the document in sub document array and if there is any duplicates of field how to retrive the latest one?


i am having db structure like this below.

{
    "_id": ObjectId("53770b9de4b0ba6f4c976a27"),
    "source": [{
        "value": 5127,
        "createdAt": ISODate("2014-05-7T07:11:00Z"),
        "generated": ISODate("2014-05-17T07:23:00Z"),
    }, {
        "value": 5187,
        "createdAt": ISODate("2014-05-17T07:39:00Z"),
        "generated": ISODate("2014-05-17T07:40:00Z"),
    }, {
        "value": 5187,
        "createdAt": ISODate("2014-05-17T07:39:00Z"),
        "generated": ISODate("2014-05-17T07:41:00Z")
    }],
}

In this there is a duplicate in the subdocument array.I need to write the mongo db query to retrive all the sub - document and if there is any duplicates then based on the "generated" values i need to rerive the latest on like below.

{
    "_id": ObjectId("53770b9de4b0ba6f4c976a27"),
    "source": [{
        "value": 5127,
        "createdAt": ISODate("2014-05-17T07:11:00Z"),
    }, {
        "value": 5187,
        "createdAt": ISODate("2014-05-17T07:39:00Z"),
        "generated": ISODate("2014-05-17T07:41:00Z")

    }],

}

Is there any way to get the data like using mongo db query?


Solution

  • With aggregation framework you can finish the job.

    db.test.aggregate([
        {$unwind: '$source'},
        {$group: {_id: {value: "$source.value", createdAt: "$source.createdAt"}, generated: {$max: "$source.generated"}}}
    ]);
    

    Which gives you the result:

    { "_id" : { "value" : 5187, "createdAt" : ISODate("2014-05-17T07:39:00Z") }, "generated" : ISODate("2014-05-17T07:41:00Z") }
    { "_id" : { "value" : 5127, "createdAt" : ISODate("2014-05-07T07:11:00Z") }, "generated" : ISODate("2014-05-17T07:23:00Z") }
    

    A little bit different from what you want. But if you really want the format above, try this:

    db.test.aggregate([
        {$unwind: '$source'},
        {$group: {_id: {_id: "$_id", value: "$source.value", createdAt: "$source.createdAt"}, generated: {$max: "$source.generated"}}},
        {$group: {_id: "$_id._id", source: {$push: {value: "$_id.value", createdAt: "$_id.createdAt", generated: "$generated"}}}}
    ]);
    

    which gives you:

    {
        "_id": ObjectId("53770b9de4b0ba6f4c976a27"),
        "source": [{
            "value": 5187,
            "createdAt": ISODate("2014-05-17T07:39:00Z"),
            "generated": ISODate("2014-05-17T07:41:00Z")
        }, {
            "value": 5127,
            "createdAt": ISODate("2014-05-07T07:11:00Z"),
            "generated": ISODate("2014-05-17T07:23:00Z")
        }]
    }