I have a collection with documents like this:
{
"_id" : ObjectId("5c0685fd6afbd73b80f45338"),
"page_id" : "1234",
"category_list" : [
"football",
"sport"
],
"time_broadcast" : "09:13"
}
{
"_id" : ObjectId("5c0685fd6afbd7355f45338"),
"page_id" : "1234",
"category_list" : [
"sport",
"handball"
],
"time_broadcast" : "09:13"
}
{
"_id" : ObjectId("5c0694ec6afbd74af41ea4af"),
"page_id" : "123456",
"category_list" : [
"news",
"updates"
],
"time_broadcast" : "09:13"
}
....
now = datetime.datetime.now().time().strftime("%H:%M")
What I want is: when "time_broadcast" is equal to "now", I get the list of distinct "category_list" values for each "page_id".
Here is what the output should look like:
[
{
"page_id" : "1234",
"category_list" : ["football", "sport", "handball"]
},
{
"page_id" : "123456",
"category_list" : ["news", "updates"]
}
]
I have tried this:
category_list = db.users.find({'time_broadcast': now}).distinct("category_list")
but this gives me the list of distinct values across all "page_id" values:
["football", "sport", "handball","news", "updates"]
not a category_list per page_id.
Any help, please?
Thanks
You need to write an aggregation pipeline:
$match - filter the documents by criteria
$group - group the documents by key field
$addToSet - aggregate the unique elements
$project - project in the required format
$reduce - reduce the array of arrays to a single array with $concatArrays
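Aggregate query:
db.tt.aggregate([
// keep only the documents broadcast at the given time
{$match : {"time_broadcast" : "09:13"}},
// collect each page's category arrays (this yields an array of arrays)
{$group : {"_id" : "$page_id", "category_list" : {$addToSet : "$category_list"}}},
// flatten the array of arrays into a single array
{$project : {"_id" : 0, "page_id" : "$_id", "category_list" : {$reduce : {input : "$category_list", initialValue : [], in: { $concatArrays : ["$$value", "$$this"] }}}}}
]).pretty()
Result: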
{ "page_id" : "123456", "category_list" : [ "news", "updates" ] }
{
"page_id" : "1234",
"category_list" : [
"sport",
"handball",
"football",
"sport"
]
}
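Note that "sport" still appears twice for page 1234 in this result: $addToSet deduplicates whole arrays, not their elements, and $concatArrays simply joins them. One way to get distinct values is to replace $concatArrays with $setUnion, which treats its inputs as sets. Below is a minimal sketch in Python, assuming PyMongo and the db.users collection from the question; the helper name build_pipeline is hypothetical.

```python
import datetime


def build_pipeline(now):
    """Return an aggregation pipeline listing distinct categories per page_id."""
    return [
        # Keep only the documents scheduled for the given "HH:MM" time.
        {"$match": {"time_broadcast": now}},
        # Collect each page's category arrays; this yields an array of arrays.
        {"$group": {"_id": "$page_id",
                    "category_list": {"$addToSet": "$category_list"}}},
        # Merge the inner arrays with $setUnion so repeated values are dropped.
        {"$project": {
            "_id": 0,
            "page_id": "$_id",
            "category_list": {"$reduce": {
                "input": "$category_list",
                "initialValue": [],
                "in": {"$setUnion": ["$$value", "$$this"]},
            }},
        }},
    ]


now = datetime.datetime.now().strftime("%H:%M")
pipeline = build_pipeline(now)

# With a live connection (assumption: `db` is a pymongo Database object):
# for doc in db.users.aggregate(pipeline):
#     print(doc)
```

$setUnion does not guarantee the order of the elements it returns, so sort each list on the client side if the order matters.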
You can add a $sort stage on page_id to the pipeline if required.
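As a small sketch of that last step in Python (the helper name add_sort is hypothetical, and the pipeline shown is only a stand-in for the full one), the $sort stage can be appended as the final stage so that it orders the documents by page_id:

```python
def add_sort(pipeline, field="page_id", ascending=True):
    """Return a copy of the pipeline with a trailing $sort stage on `field`."""
    direction = 1 if ascending else -1
    return list(pipeline) + [{"$sort": {field: direction}}]


# Stand-in pipeline; in practice this would be the full $match/$group/$project list.
base = [{"$match": {"time_broadcast": "09:13"}}]
sorted_pipeline = add_sort(base)
print(sorted_pipeline[-1])  # {'$sort': {'page_id': 1}}
```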