I have a collection of the following data:
{
"_id" : ObjectId("51f1fcc08188d3117c6da351"),
"cust_id" : "abc123",
"ord_date" : ISODate("2012-10-03T18:30:00Z"),
"status" : "A",
"price" : 25,
"items" : [{
"sku" : "ggg",
"qty" : 7,
"price" : 2.5
}, {
"sku" : "ppp",
"qty" : 5,
"price" : 2.5
}]
}
I am using the query:
cmd { "aggregate" : "orders" , "pipeline" : [
{ "$unwind" : "$items"} ,
{ "$match" : { "items" : { "$elemMatch" : { "qty" : { "$in" : [ 7]}}}}} ,
{ "$group" : { "price" : { "$first" : "$price"} , "items" : { "$push" : { "sku" : "$items.sku"}} , "_id" : { "items" : "$items"}}} ,
{ "$sort" : { "price" : -1}} ,
{ "$project" : { "_id" : 0 , "price" : 1 , "items" : 1}}
]}
Not able to understand what is going wrong
It's because you're doing $match
after $unwind
. $unwind
generates a new stream of documents where items
is no longer an array (see docs).
It emits each document as many times as there are items in it.
If you want to select documents with desired element in it and then process all of its documents, you should call $match
first:
db.orders.aggregate(
{ "$match" : { "items" : { "$elemMatch" : { "qty" : { "$in" : [ 7]}}}}},
{ "$unwind" : "$items"},
...
);
If you want to select items to be processed after $unwind
, you shoul remove $elemMatch
:
db.orders.aggregate(
{ "$unwind" : "$items"},
{ "$match" : { "items.qty" : { "$in" : [7]}}},
...
);
In first case you'll get two documents:
{
"price" : 25,
"items" : [
{"sku" : "ppp"}
]
},
{
"price" : 25,
"items" : [
{"sku" : "ggg"}
]
}
and in second case you'll get one:
{
"price" : 25,
"items" : [
{"sku" : "ggg"}
]
}
Update. After $unwind
your documents will look like:
{
"_id" : ObjectId("51f1fcc08188d3117c6da351"),
"cust_id" : "abc123",
"ord_date" : ISODate("2012-10-03T18:30:00Z"),
"status" : "A",
"price" : 25,
"items" : {
"sku" : "ggg",
"qty" : 7,
"price" : 2.5
}
}