I have been writing an aggregation pipeline to show a summarized version of data from a collection.
Sample Structure of Document:
{
_id: 'abcxyz',
eventCode: 'EVENTCODE01',
eventName: 'SOMEEVENT',
units: 1,
rate: 2,
cost: 2,
distribution: [
{
startDate: 2021-05-31T04:00:00.000+00:00
units: 1
}
]
}
I have grouped it and merged the distribution into a single list with $unwind
step before $group
:
[
$unwind: {
path: '$distribution',
preserveNullAndEmptyArrays: false
},
$group: {
_id: {
eventName: '$eventName',
eventCode: '$eventCode'
},
totalUnits: {
$sum: '$units'
},
distributionList: {
$push: '$distribution'
},
perUnitRate: {
$avg: '$rate'
},
perUnitCost: {
$avg: '$cost'
}
}
]
Sample Output:
{
_id: {
eventName: 'EVENTNAME101'
eventCode: 'QQQ'
},
totalUnits: 7,
perUnitRate: 2,
perUnitCost: 2,
distributionList: [
{
startDate: 2021-05-31T04:00:00.000+00:00,
units: 1
},
{
startDate: 2021-05-31T04:00:00.000+00:00,
units: 1
},
{
startDate: 2021-06-07T04:00:00.000+00:00,
units: 1
}
]
}
I'm getting stuck at the next step; I want to consolidate the distributionList
into a new List with no repeating startDate
.
Example: Since first 2 objects of distributionList have the same startDate, it should be a single object in output with sum of units:
Expected:
{
_id: {
eventName: 'EVENTNAME101'
eventCode: 'QQQ'
},
totalUnits: 7,
perUnitRate: 2,
perUnitCost: 2,
newDistributionList: [
{
startDate: 2021-05-31T04:00:00.000+00:00,
units: 2 //units summed for first 2 objects
},
{
startDate: 2021-06-07T04:00:00.000+00:00,
units: 1
}
]
}
I couldn't use $unwind
or $bucket
as I intend to keep the grouping I did in previous steps ($group
).
Can I get suggestions or a different approach if this doesn't seem accurate?
You may want to do the first $group
at eventName, eventCode, distribution.startDate
level. Then, you can $group
again at eventName, eventCode
level and using $first
to keep your original $group
fields.
db.collection.aggregate([
{
$unwind: {
path: "$distribution",
preserveNullAndEmptyArrays: false
}
},
{
$group: {
_id: {
eventName: "$eventName",
eventCode: "$eventCode",
distributionStartDate: "$distribution.startDate"
},
totalUnits: {
$sum: "$units"
},
distributionUnitsSum: {
$sum: "$distribution.units"
},
perUnitRate: {
$avg: "$rate"
},
perUnitCost: {
$avg: "$cost"
}
}
},
{
$group: {
_id: {
eventName: "$_id.eventName",
eventCode: "$_id.eventCode"
},
totalUnits: {
$first: "$totalUnits"
},
newDistributionList: {
$push: {
startDate: "$_id.distributionStartDate",
units: "$distributionUnitsSum"
}
},
perUnitRate: {
$first: "$perUnitRate"
},
perUnitCost: {
$first: "$perUnitCost"
}
}
}
])
Here is the Mongo Playground to show the idea for your reference.