I have a collection of documents that each contain arrays. the output i want is the unwinded array elements filtered down by a certain criteria.
i can do this by doing:
db.collection.aggregate([
{
$unwind: "$myArrayField"
},
{
$match: {
"myArrayField.myCriteriaField": "myValue"
}
}
])
but it seems wasteful to unwind everything, so instead i do
db.collection.aggregate([
{
$match: {
"myArrayField.myCriteriaField": "myValue"
}
},
{
$unwind: "$myArrayField"
},
{
$match: {
"myArrayField.myCriteriaField": "myValue"
}
}
])
to filter down to viable document candidates, unwind those, and then filter down the unwound elements again since there will be elements that don't match.
alternatively, i can do
db.collection.aggregate([
{
$project: {
myArrayField: {
$filter: {
input: "$myArrayField",
as: "element",
cond: { $eq: ["$$element.myCriteriaField", "myValue"] }
}
}
}
},
{
$unwind: "$myArrayField"
}
])
or even
db.collection.aggregate([
{
$match: {
"myArrayField.myCriteriaField": "myValue"
}
},
{
$project: {
myArrayField: {
$filter: {
input: "$myArrayField",
as: "element",
cond: { $eq: ["$$element.myCriteriaField", "myValue"] }
}
}
}
},
{
$unwind: "$myArrayField"
}
])
which one is the most efficient? i don't understand enough about how the aggregations are run to know what variables it could depend on (collection size, document size, etc).
lastly, the match unwind match seems pretty straightforward but it also just feels wrong to have to do it like that so i'm wondering if i'm missing something.
First, You must sure have an index on the field. Using Explain
to check index hit with $match
first stage.
Now, we explain your 2 suggestion: $project
$filter
$unwind
vs $match
$project
$filter
$unwind
:
$project
$filter
$unwind
: with first stage $project
, mongodb is doing a collection scan, outputting ALL documents
in that collection with that field. Now, It's a full collection scan
and slow
$match
$project
$filter
$unwind
: with first stage $match
, mongodb is doing an index scan instead of a collection if It hit an index. After that, stage $project
will do on a limit document
that matched. Now, it a index scan
and faster
because mongodb only do on some document.
Final, with your 2 option, the second is better