In my database I have activities performed in different periods (start in a day and finish another day). This is an example of a document:
{
"activity" : "test",
"period" : [ ISODate("2020-01-01T00:00:00Z"), ISODate("2020-02-10T00:00:00Z")]
}
The "period" is an array of 2 elements, the first is the start time and the second is the end.
I would like to write a query which gives me in output the number of activities in the time.
For better understating, considering this 2 activities:
{
"activity" : "first_activity",
"period" : [ ISODate("2020-01-01T00:00:00Z"),ISODate("2020-02-10T00:00:00Z")]
},
{
"activity" : "second_activity",
"period" : [ ISODate("2020-01-13T00:00:00Z"), ISODate("2020-01-26T00:00:00Z")]
}
My goal is an output which allows me to draw a graph of couples ( date , #activities ), something like this:
{
"quantity" : 1,
"date" : ISODate("2020-01-01T00:00:00Z")
},
{
"quantity" : 2,
"date" : ISODate("2020-01-13T00:00:00Z")
},
{
"quantity" : 1,
"date" : ISODate("2020-01-26T00:00:00Z")
},
{
"quantity" : 0,
"date" : ISODate("2020-02-10T00:00:00Z")
}
Unfortunately I didn't reach the final result yet... and probably I'm writing a query not in a "very nice way"... My logic flow is the following:
So, by writing this query:
[
{ $unwind : {
path: "$period",
includeArrayIndex: "arrayIndex"
}
},
{
$project: { _id:0, date: "$period", val: { $cond: { if : { $lt: ["$arrayIndex",1] } , then: 1 , else: -1 } }}
},
{
$sort : { date : 1}
},
{
"$group":{
_id : { $dateToString: { format: "%Y-%m-%d", date: "$date" } },
delta: { $sum: "$val" }
}
}
]
I could reach the result where I have the list of days when the number of activities changed together with the number of how much it changed (delta).
Something like this can be the output:
{ _id : "2020-09-30", delta: 3 }
{ _id : "2020-11-04", delta: -2 }
{ _id : "2020-12-01", delta: 1 }
But my goal is to have a "cumulative sum" so a result like this:
{ _id : "2020-09-30", count: 3 }
{ _id : "2020-11-04", count: 1 }
{ _id : "2020-12-01", count: 2 }
What kind of additional query I should develop?
At the end I solved in a way which sound too much complex... I think that the query is too slow. I had to save 2 array with all the delta
values and one with all the date
. Then I could play doing the sum until the specific position (thanks to the arrayindex saved on the second $unwind
).
So to the query insert in the question I added the following:
{
"$group":{
"_id":0,
"time":{ "$push":"$_id" },
"totals":{ "$push":"$delta" }
}
},
{
"$unwind":{ "path":"$time", "includeArrayIndex":"index" }
},
{
"$project":{
"_id":0,
"time":1,
"delta":{ "$arrayElemAt":[ "$totals", "$index" ] },
"total":{ "$sum":{ "$slice":[ "$totals", { "$add":[ "$index", 1 ] } ] } }
}
}