Search code examples
mongodbmongodb-queryaggregation-frameworkaggregation

Group by month and year is not working mongodb


Here is the query,

db.getCollection('_ad.insight').aggregate([
    {
        $match:{
            date: {
                $lte: ISODate('2018-12-31'),
                $gte: ISODate('2017-01-01')
            }
        }
    },
    {
        $project:{
            _id: 0,
            year: {$year: "$date"},
            month: {$month: "$date"},
            day: {$dayOfMonth: "$date"},
            date: 1,
            clicks: 1,
            reach: 1 
        }
    },
    {
        $group:{
            _id: {
                month: "$month",
                year: "$year"
            },
            clicks: {$sum: "$clicks"},
            reach: {$sum: "$reach"},
            date: {$addToSet: "$date"}
        }
    },
    {
        $project:{
            _id: 0,
            month: "$_id.month",
            year: "$_id.year",
            clicks: 1,
            reach: 1,
            date: 1
        }
    }
]);

And the response I am getting,

/* 1 */
{
    "clicks" : 1096,
    "reach" : 33288,
    "date" : [
        ISODate("2018-01-01T00:00:00.000+05:00"),
        ISODate("2017-12-31T00:00:00.000+05:00"),
        ISODate("2017-12-28T00:00:00.000+05:00"),
        ISODate("2017-12-26T00:00:00.000+05:00"),
        ISODate("2017-12-24T00:00:00.000+05:00"),
        ISODate("2017-12-23T00:00:00.000+05:00"),
        ISODate("2017-12-25T00:00:00.000+05:00"),
        ISODate("2017-12-29T00:00:00.000+05:00"),
        ISODate("2017-12-22T00:00:00.000+05:00"),
        ISODate("2017-12-21T00:00:00.000+05:00"),
        ISODate("2017-12-30T00:00:00.000+05:00"),
        ISODate("2017-12-20T00:00:00.000+05:00"),
        ISODate("2017-12-27T00:00:00.000+05:00")
    ],
    "month" : 12,
    "year" : 2017
},

/* 2 */
{
    "clicks" : 1629,
    "reach" : 98113,
    "date" : [
        ISODate("2018-01-05T00:00:00.000+05:00"),
        ISODate("2018-01-04T00:00:00.000+05:00"),
        ISODate("2018-01-03T00:00:00.000+05:00"),
        ISODate("2018-01-07T00:00:00.000+05:00"),
        ISODate("2018-01-08T00:00:00.000+05:00"),
        ISODate("2018-01-02T00:00:00.000+05:00"),
        ISODate("2018-01-06T00:00:00.000+05:00")
    ],
    "month" : 1,
    "year" : 2018
}

Sample Collection: Its a flat structure, contains around 400 fields, but i am showing only those which i am using in query.

{
    "_id" : ObjectId("5akjbrd51f193455adtrf6fc"),
    "clicks" : 5,
    "reach" : 10 
    "date" : ISODate("2018-01-06T00:00:00.000+05:00"),
    "post_engagement" : 127,
    "post_reactions" : 1,
    "post_shares" : 0,
    "qualificationfailed" : 0,
    "qualificationfailed_conversion_value" : 0
}

Desired Output:

/* 1 */
{
    "clicks" : 1096,
    "reach" : 33288,
    "date" : [
        ISODate("2018-01-01T00:00:00.000+05:00"),//this shouldn't be here
        ISODate("2017-12-31T00:00:00.000+05:00"),
        ISODate("2017-12-28T00:00:00.000+05:00"),
        ISODate("2017-12-26T00:00:00.000+05:00"),
        ISODate("2017-12-24T00:00:00.000+05:00"),
        ISODate("2017-12-23T00:00:00.000+05:00"),
        ISODate("2017-12-25T00:00:00.000+05:00"),
        ISODate("2017-12-29T00:00:00.000+05:00"),
        ISODate("2017-12-22T00:00:00.000+05:00"),
        ISODate("2017-12-21T00:00:00.000+05:00"),
        ISODate("2017-12-30T00:00:00.000+05:00"),
        ISODate("2017-12-20T00:00:00.000+05:00"),
        ISODate("2017-12-27T00:00:00.000+05:00")
    ],
    "month" : 12,
    "year" : 2017
},

/* 2 */
{
    "clicks" : 1629,
    "reach" : 98113,
    "date" : [
        // ISODate("2018-01-01T00:00:00.000+05:00") this should be in this group
        ISODate("2018-01-05T00:00:00.000+05:00"),
        ISODate("2018-01-04T00:00:00.000+05:00"),
        ISODate("2018-01-03T00:00:00.000+05:00"),
        ISODate("2018-01-07T00:00:00.000+05:00"),
        ISODate("2018-01-08T00:00:00.000+05:00"),
        ISODate("2018-01-02T00:00:00.000+05:00"),
        ISODate("2018-01-06T00:00:00.000+05:00")
    ],
    "month" : 1,
    "year" : 2018
}

The issue is, ISODate("2018-01-01T00:00:00.000+05:00") as you can see in output document 1, in date array the above mentioned date is on first index. It shows the "month" : 12 and "year" : 2017 as i am grouping by month and year. So my concern is, ISODate("2018-01-01T00:00:00.000+05:00") should belong to group number 2, that is 2 output document but its showing up in group 1.

I don't know what i am doing wrong, as its a simple pipeline. Please help!!


Solution

  • Take note that ISODate("2018-01-01T00:00:00.000+05:00") is in UTC + 5. This means that this entry has this date 2017-12-31T19:00:00 on UTC time.

    Mongo is grouping the dates according to UTC.

    You might want to check this post for dealing with different timezones How to aggregate by year-month-day on a different timezone