I have a Mongo collection series where each document has a list with dataPoints. All series with the same testStepId contain the same amount of dataPoints:
{
"seriesId": {
"seriesId": "77678ca1-31db-4cec-a042-68a3053b92c6"
},
"testStepId": {
"testStepId": "c152415b-2392-4c2b-af74-51a4973bd257"
},
"measurement": {
"startTime": {
"$date": "2020-07-07T12:40:49.782Z"
},
"endTime": {
"$date": "2020-07-07T12:42:19.782Z"
}
},
"dataPoints": [
{
"timeStamp": {
"$date": "2020-07-07T12:41:09.782Z"
},
"value": "Value_1_1"
},
{
"timeStamp": {
"$date": "2020-07-07T12:41:29.782Z"
},
"value": "Value_1_2"
},
{
"timeStamp": {
"$date": "2020-07-07T12:41:39.782Z"
},
"value": "Value_1_3"
},
...
{
"timeStamp": {
"$date": "2020-07-07T12:42:19.782Z"
},
"value": "Value_2_11"
}
]
}
Now I want to query all series documents that match a specific testStepId (no problem). But instead of loading all dataPoints of all found series I want to load in sum only 1000 dataPoints. So in case of 10 found series I have to load only 100 dataPoints per series:
-> Load every (dataPoints.size() / 100)-th dataPoint
-> This means I have to consider the count of found series documents and the count of dataPoints in the series
-> Load every X-th dataPoint where
X = 1000 / <count of documents> / <count of dataPoints>
I'm trying hard to get this done via aggregation with the MongoDB Compass. But I'm still failing with counting the found documents and unsing this value ...
To start simple I just try to get every 2nd dataPoint:
{
project: {
dataPoints: {
$map: {
input: { $range: [ 0, {"$size": "$dataPoints"}, 2 ] },
as: "index",
in: { $arrayElemAt: [ "$dataPoints", "$$index" ] }
}
}
}
}
-> Works fine
Now I want to get every x-th 'dataPoint' dependend on the count of found documents. For this I tried some different ways, none of them works ...
{
project: {
dataPoints: {
$map: {
input: { $range: [ 0, {"$size": "$dataPoints"}, $count ] },
as: "index",
in: { $arrayElemAt: [ "$dataPoints", "$$index" ] }
}
}
}
}
-> "Project specification must be an object"
{
project: {
dataPoints: {
$let: {
vars: {
total: "$count",
},
in: {
$map: {
input: { $range: [ 0, {"$size": "$dataPoints"}, "$$total"] },
as: "index",
in: { $arrayElemAt: [ "$dataPoints", "$$index" ] }
}
}
}
}
}
}
-> "$range requires a numeric value step, found value of type:missing"
Obviously my approach is wrong. Could any body give me some hint how to get this working?
I think the formula for X
is X = <count of dataPoints> * <count of documents> / 1000
You can not directly access the number of documents (count) at a particular aggregation pipeline stage. However, you can combine all documents into a single document and count them, then expand them back into separate documents. You can achieve this by using $group
or $facet
.
I'll show an example with $group
[
{
$group: {
_id: null,
count: { $sum: 1 },
all: { $push: "$$ROOT" }
}
},
{
$unwind: "$all"
},
{
$replaceWith: { // $replaceWith is available from v4.2, for earlier version use { $replaceRoot: { newRoot: <doc> } }
$mergeObjects: [
"$all",
{
dataPoints: {
$map: {
input: {
$range: [
0,
{ $size: "$all.dataPoints" },
{
$ceil: {
$divide: [
{
$multiply: [
{ "$size": "$all.dataPoints" },
"$count"
]
},
1000
]
}
}
]
},
as: "index",
in: { $arrayElemAt: ["$all.dataPoints", "$$index"] }
}
}
}
]
}
}
]