I have a given MongoDb collection with documents which look similar to this:
{
items: [
{ id: 1, data: 'a' },
{ id: 2, data: 'b' },
{ id: 3, data: 'c' },
]
}
Now I have to $lookup
from this collection. I need the array data from the element with id = 2 where id = 1 data matches the lookup value $$value
. My first naive try to find the correct document looked like this:
$match {
items: { $elemMatch: {
id: 1,
data: '$$value'
} }
}
But the $$value
is not evaluated, so the data is compared to the literal instead of its value. I also tried to evaluate the value using $expr
, but I wasn't able to get the syntax right (if possible at all?).
The only working way I was able to get working was by extracting the data first and do the matching afterwards:
[
{
$project: {
_id: 0,
data1: {
$filter: {
input: '$items', as: 'item',
cond: {$eq: ['$$item.id', 1]}
}
},
data2: {
$filter: {
input: '$items', as: 'item',
cond: {$eq: ['$$item.id', 2]}
}
},
}
},
{
$set: {
data1: {$arrayElemAt: ['$data1.data', 0]},
data2: {$arrayElemAt: ['$data2.data', 0]}
}
},
{
$match: {
$expr: {$eq: ['$data1', '$$value']}
}
}
];
But as one might expect, this approach is much slower. For the data involved the query took over 7s, whereas the upper approach (using a constant instead of a variable $$value
) over 3x as fast.
Is it possible to use the variable $$value
directly in the $elemMatch
operator? Or is there any other optimization available to speed-up the collection lookup?
So first let's understand why your naive approach failed, Mongo's pipelined lookup docs specify that:
A $match stage requires the use of an $expr operator to access the variables. $expr allows the use of aggregation expressions inside of the $match syntax.
So the variable value
that you are defining in the start of the $lookup
is only accessible with the usage of $expr, now also specified in the $expr
docs:
The arguments can be any valid aggregation expression
Sadly for us $elemMatch is not a "aggregation expression" as it belongs to the "query language". this is why your first approach failed. You are just not allowed to use $elemMatch
within an $expr
which is required in order to access the $$value
field.
So what can we do?, well you can just use $filter as you already started doing:
db.collection.aggregate([
{
$lookup: {
from: "collection2",
let: {
value: "$data"
},
pipeline: [
{
$match: {
$expr: {
$gt: [
{
$size: {
$filter: {
input: "$items",
as: "item",
cond: {
$and: [
{
$eq: [
"$$item.id",
1
]
},
{
$eq: [
"$$item.data",
"$$value"
]
}
]
}
}
}
},
0
],
}
}
}
],
as: "res"
}
}
])
It would be easier to answer fully if you provide the full structure of both collections and the required end result as I had to guess some of the stuff here in order to write this pipeline.