Say for every document of a collection, it has an string array. how could I count the repetitive time of every element of the array in all this collection? Right now I can find all the distinct element, but then Map Reduce function is a little tricky that I haven't fully understood.
Doc A
{
_id:
name:
actors: ["a", "b", "c"]
}
Doc B
{
_id:
name:
actors: ["a", "d"]
}
Doc C
{
_id:
name:
actors: ["a", "c", "f"]
}
I wanne get a statistic result with a:3 b:1 c:2 d:1 f:1.
An alternative route that you could take is the aggregation framework. Considering the above collection as an example
Populate test collection:
db.collection.insert([
{ "_id" : 1, "name" : "ABC1", "actors": ["a", "b", "c"] },
{ "_id" : 2, "name" : "ABC2", "actors" : ["a", "d"] },
{ "_id" : 3, "name" : "XYZ1", "actors" : ["a", "c", "f"] }
])
Using MongoDB 3.4.4 or newer:
db.collection.aggregate([
{ "$unwind" : "$actors" },
{ "$group": { "_id": "$actors", "count": { "$sum": 1} } },
{ "$group": {
"_id": null,
"counts": {
"$push": {
"k": "$_id",
"v": "$count"
}
}
} },
{ "$replaceRoot": {
"newRoot": { "$arrayToObject": "$counts" }
} }
])
Output
{
a: 3,
b: 1,
c: 2,
d: 1,
f: 1
}
Using MongoDB 3.2 and below:
The following aggregation pipeline operation uses the $unwind
stage to output a document for each element in the actors
array and the $group
stage to group the documents by the value in the actors
array then
counts the number of documents per each group (which gives the occurrence of the array elements as a group) by way of the $sum
operator:
db.collection.aggregate([
{ "$unwind" : "$actors" },
{ "$group": { "_id": "$actors", "count": { "$sum": 1} } }
])
The operation returns the following results which would be a close match to your expectations but won't give you the documents as key/value pair:
/* 0 */
{
"result" : [
{
"_id" : "f",
"count" : 1
},
{
"_id" : "d",
"count" : 1
},
{
"_id" : "c",
"count" : 2
},
{
"_id" : "b",
"count" : 1
},
{
"_id" : "a",
"count" : 3
}
],
"ok" : 1
}