I have a collection like this:
[
{timestamp: xxx, type:'start'},
{timestamp: xxx, type:'log'},
{timestamp: xxx, type:'log'},
{timestamp: xxx, type:'log'},
{timestamp: xxx, type:'start'},
{timestamp: xxx, type:'log'},
{timestamp: xxx, type:'log'}
]
How do I aggregate each start event with the log events that follow it, up to but not including the next start?
The result should look something like this:
{
events:[
{timestamp: xxx, type:'start'},
{timestamp: xxx, type:'log'},
{timestamp: xxx, type:'log'},
{timestamp: xxx, type:'log'}
]
},
{
events:[
{timestamp: xxx, type:'start'},
{timestamp: xxx, type:'log'},
{timestamp: xxx, type:'log'}
]
}
Assumption: the timestamp field is of a sortable type.
You can use $setWindowFields (available from MongoDB 5.0) to compute a "grouping" key for each event. The idea is to take the $max of the timestamps of the start events in the document window [unbounded, current], i.e. over the current document and all documents before it in timestamp order. Non-start events contribute -1, so each event ends up carrying the timestamp of the most recent start. After computing the grouping, a simple $group puts the events together in an array.
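db.collection.aggregate([
  {
    // Tag every event with the timestamp of the latest "start" seen so far.
    "$setWindowFields": {
      "sortBy": { "timestamp": 1 },
      "output": {
        "grouping": {
          "$max": {
            "$cond": {
              "if": { "$eq": ["$type", "start"] },
              "then": "$timestamp",
              "else": -1  // non-start events never win the max
            }
          },
          // Window: first document up to and including the current one.
          "window": { "documents": ["unbounded", "current"] }
        }
      }
    }
  },
  {
    // Collect the events that share the same grouping key.
    "$group": {
      "_id": "$grouping",
      "events": {
        "$push": { "timestamp": "$timestamp", "type": "$type" }
      }
    }
  },
  // Optional: return the groups in chronological order of their start event.
  { "$sort": { "_id": 1 } }
])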
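To see why the running maximum works as a grouping key, here is a minimal plain-JavaScript sketch of the same two steps outside MongoDB. The sample timestamps (1 to 7) and the function name groupByLatestStart are made up for illustration; they are not part of the question's data, and the sketch assumes numeric timestamps greater than -1.

```javascript
// Hypothetical sample data: numeric timestamps stand in for the real ones.
const sampleEvents = [
  { timestamp: 1, type: 'start' },
  { timestamp: 2, type: 'log' },
  { timestamp: 3, type: 'log' },
  { timestamp: 4, type: 'log' },
  { timestamp: 5, type: 'start' },
  { timestamp: 6, type: 'log' },
  { timestamp: 7, type: 'log' },
];

// Illustrative helper (not a MongoDB API): mimics the two pipeline stages.
function groupByLatestStart(events) {
  // Like "sortBy" in $setWindowFields: process events in timestamp order.
  const sorted = [...events].sort((a, b) => a.timestamp - b.timestamp);

  const groups = new Map(); // a Map keeps keys in insertion order
  let grouping = -1;        // running max; -1 means "no start seen yet"

  for (const event of sorted) {
    // Running $max over the window [unbounded, current]:
    // start events contribute their timestamp, all others contribute -1.
    const candidate = event.type === 'start' ? event.timestamp : -1;
    grouping = Math.max(grouping, candidate);

    // Like $group with $push: append the event to its group's array.
    if (!groups.has(grouping)) groups.set(grouping, []);
    groups.get(grouping).push({ timestamp: event.timestamp, type: event.type });
  }

  return [...groups.values()].map((evts) => ({ events: evts }));
}

const grouped = groupByLatestStart(sampleEvents);
console.log(JSON.stringify(grouped, null, 2));
```

With this sample input, the first group holds the events with timestamps 1 to 4 and the second holds 5 to 7. Any log events that appear before the first start would land in a group keyed by -1, which matches what the pipeline does.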
P.S. The expected result shape may hit MongoDB's 16 MB document size limit if a single grouping contains too many records.