
Group By Condition in MongoDB


I have a series of documents (check events) in MongoDB that look like this:

{
    "_id" : ObjectId("5397a78ab87523acb46f56"),
    "inspector_id" : ObjectId("5397997a02b8751dc5a5e8b1"),
    "status" : 'defect',
    "utc_timestamp" : ISODate("2014-06-11T00:49:14.109Z")
}

{
    "_id" : ObjectId("5397a78ab87523acb46f57"),
    "inspector_id" : ObjectId("5397997a02b8751dc5a5e8b2"),
    "status" : 'ok',
    "utc_timestamp" : ISODate("2014-06-11T00:49:14.109Z")
}

I need to get a result set that looks like this:

[
  {
    "date" : "2014-06-11",
    "defect_rate" : '.92' 
  },  
  {
    "date" : "2014-06-11",
    "defect_rate" : '.84' 
  }, 
]

In other words, I need to get the average defect rate per day. Is this possible?


Solution

  • The aggregation framework is what you want:

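    db.collection.aggregate([
        // Stage 1: one group per calendar day (UTC) of utc_timestamp
        { "$group": {
            "_id": {
                "year": { "$year": "$utc_timestamp" },
                "month": { "$month": "$utc_timestamp" },
                "day": { "$dayOfMonth": "$utc_timestamp" }
            },
            // Conditional count: add 1 only when status is "defect"
            "defects": {
                "$sum": { "$cond": [
                    { "$eq": [ "$status", "defect" ] },
                    1,
                    0
                ]}
            },
            // Count of all events on that day
            "totalCount": { "$sum": 1 }
        }},
        // Stage 2: defect rate = defects / totalCount (the day stays in _id)
        { "$project": {
            "defect_rate": {
                "$cond": [
                    { "$eq": [ "$defects", 0 ] },
                    0,
                    { "$divide": [ "$defects", "$totalCount" ] }
                ]
            }
        }}
    ])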
    

    The $group stage groups on the calendar day using the date aggregation operators and counts every event for that day in totalCount. Inside it, the $cond operator tests whether "status" is "defect", so the surrounding $sum becomes a conditional count in which only the "defect" events contribute.

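    Once the events are grouped per day, the $project stage uses $divide to compute defects over totalCount. The $cond around it returns 0 directly when a day has no defects; totalCount is at least 1 for every group, so the division itself can never hit a zero divisor.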
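
To make the arithmetic concrete, here is a minimal plain-JavaScript sketch of what the two stages compute, run in memory rather than against MongoDB. The function name `defectRatesByDay` and the sample `events` are hypothetical and chosen only for illustration; the sketch assumes each event carries a `status` string and a `utc_timestamp` Date, as in the question.

```javascript
// In-memory sketch of the pipeline's logic (not MongoDB code).
// defectRatesByDay and the sample events are hypothetical, for illustration only.
function defectRatesByDay(events) {
  const groups = new Map();
  for (const { status, utc_timestamp } of events) {
    // Same idea as grouping on $year/$month/$dayOfMonth: the UTC calendar day.
    const date = utc_timestamp.toISOString().slice(0, 10);
    const g = groups.get(date) || { defects: 0, totalCount: 0 };
    // Same idea as $sum over $cond: add 1 only when status is "defect".
    g.defects += status === "defect" ? 1 : 0;
    // Same idea as { $sum: 1 }: count every event.
    g.totalCount += 1;
    groups.set(date, g);
  }
  // Same idea as the $project stage: defects divided by totalCount per day.
  return [...groups].map(([date, g]) => ({
    date,
    defect_rate: g.defects / g.totalCount,
  }));
}

const events = [
  { status: "defect", utc_timestamp: new Date("2014-06-11T00:49:14.109Z") },
  { status: "ok", utc_timestamp: new Date("2014-06-11T08:15:00.000Z") },
  { status: "ok", utc_timestamp: new Date("2014-06-12T09:30:00.000Z") },
];

const rates = defectRatesByDay(events);
console.log(rates);
```

With these three sample events, the first day has one defect out of two events (rate 0.5) and the second day has none (rate 0). Unlike the pipeline above, this sketch also formats the day as a "YYYY-MM-DD" string, which is closer to the shape asked for in the question.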