I am joining a products collection on a productDetail collection where the product _id is the _id in the productDetail collection. I want to be able to find the products which are not on sale but exist in the productDetail table to include in my output as inProductDetailCollection for my reporting. I took a stab at it but it does not work. It always return true even when some _ids are not in the productDetail table. Here is my attempt. How can I get it to set the flag inProductDetailCollection to be false if missing or true if present
db.getCollection("products").aggregate([
{
"$match": {
"onSale": {
"$exists": false
}
}
},
{
"$lookup": {
"from": "productDetail",
"localField": "_id",
"foreignField": "_id",
"as": "productDetail"
}
},
{
"$match": {
"$expr": {
"$eq": [
"$productDetail",
[]
]
}
}
},
{
"$unwind": {
"path": "$productDetail",
"preserveNullAndEmptyArrays": true
}
},
{
"$addFields": {
"onSale": false
}
},
{
"$group": {
"_id": {
"productId": "$_id",
"onSale": "$onSale"
}
}
},
{
"$project": {
"productId": "$_id.productId",
"onSale": "$_id.onSale",
"inProductDetailCollection":
{
$cond: { if: { $gt: [ "_id", 0 ] }, then: true, else: false }
},
"_id": 0
}
}
]
)
Here is a sample of the data. One thing to note is products in the productDetail may not have _ids present in the productDetail table since this table gets uploaded by a nightly batch job. Also, products which are not on sale will have no 'onSale': 'Y' attribute. This is what the data looks like:
//products collection, some fields omitted for brevity
[
{
"_id": 123345,
"name": "NutraFast",
"description": "Supplement",
"price": 35.99
},
{
"_id": 13443,
"name": "BerryBlast",
"description": "Athletes Sports Drink",
"price": 12.99
},
{
"_id": 15644
"name": "MagnoPower-11",
"description": "Supplement",
"price": 45.99
}
,{
"_id": 17011
"name": "Zinc566",
"description": "Supplement",
"price": 25.99
},
{
"_id": 15011
"name": "VitaMax",
"description": "Supplement",
"price": 15.99
},
{
"_id": 15311
"name": "VitaMax",
"description": "Supplement",
"price": 15.99
},
{
"_id": 15316
"name": "Chlorphyl Cleanse",
"description": "Supplement",
"price": 55.99
}
]
Here is the productDetail collection sample
//productDetail collection
[
{
"_id": 123345,
"discount": 0.3,
"amount-per-unit": 50,
"inStock": "Y"
},
{
"_id": 13443,
"discount": 0.5,
"onSale": "Y",
"amount-per-unit": 60,
"inStock": "Y"
"onSale":"Y"
},
{
"_id" : 15644,
"discount": 0.5,
"onSale": "Y",
"amount-per-unit": 60,
"inStock": "Y"
},
{
"_id" : 15011,
"discount": 0.5,
"amount-per-unit": 60,
"inStock": "Y"
},
{
"_id" : 17011,
"discount": 0.5,
"amount-per-unit": 60,
"inStock": "Y"
}
]
It's very difficult to understand what you actually need. However, I have taken the liberty of correcting some invalid object structure in your sample documents and below are two possible aggregations that should hopefully get you the output you're looking for:
Example 1
This will return only the products
documents that have:
productDetail
productDetail
that includes the $lookup
documentinProductDetailCollectionButNotOnSale
true
only if the product
is not on salefalse
otherwisedb.getCollection("products").aggregate([
{
$lookup: {
from: "productDetail",
localField: "_id",
foreignField: "_id",
as: "productDetail"
}
},
{
$unwind: {
path: "$productDetail"
}
},
{
$set: {
inProductDetailCollectionButNotOnSale: {
$cond: [
{
$eq: [
"$productDetail.onSale",
"Y"
]
},
false,
true
]
}
}
}
])
See HERE for a working example.
Example 2
This will return all products
documents and includes:
productDetail
that includes the $lookup
document (if any)inProductDetailCollectionButNotOnSale
true
only if the product
has a matching productDetail
and is not on salefalse
otherwisedb.getCollection("products").aggregate([
{
$lookup: {
from: "productDetail",
localField: "_id",
foreignField: "_id",
as: "productDetail"
}
},
{
$unwind: {
path: "$productDetail",
preserveNullAndEmptyArrays: true
}
},
{
$set: {
inProductDetailCollectionButNotOnSale: {
$cond: [
{
$and: [
{
$gt: [
"$productDetail",
0
]
},
{
$ne: [
"$productDetail.onSale",
"Y"
]
}
]
},
true,
false
]
}
}
}
])
See HERE for a working example.
Note: for clarity and in case anyone else wonders why this part works:
$gt: [
"$productDetail",
0
]
Mongodb has a Comparison/Sort Order which lists the BSON types in order from lowest to highest. What I have done is compare the $productDetail
field against a Number
. Since Object
is 4th on the list and Number
is 3rd on the list it basically means if $productDetail
exists in the document then it will be on level 4 which is higher ($gt
) than a the number zero which is on level 3 on the list.
It returns true because it essentially translates to:
$gt: [
4, // 4 is greater than 3
3
]
If $productDetail
doesn't exist then it will return false because a null
is on level 2 and Number
is higher on level 3 so it translates to:
$gt: [
2, // 2 is NOT greater than 3
3
]