MongoDB: counting combinations of items in an array


I have documents like this:

[
{
    ....
    tags : ["A","B"]
},
{
    ....
    tags : ["A","B"]
},
{
    ....
    tags : ["J","K"]
},
{
    ....
    tags : ["A","B","C"]
}
]

With the Aggregation Framework I'd like to group by combinations (pairs) of array items, to get something like this:

[
{
    _id : ["A","B"],
    count : 3
},
{
    _id : ["J","K"],
    count : 1
},
{
    _id : ["A","C"],
    count : 1
},
{
    _id : ["B","C"],
    count : 1
}
]

Is it possible? Thanks in advance.


Solution

  • Query

    • Every tags array of size 1 is padded with null, so ["V"] becomes ["V", null] (if you don't want to count the pairs containing null, filter them out of the final results).
    • $map pairs each member with the remaining members. For example, ["A", "B", "C"] gives "A" together with ["B", "C"], "B" together with ["A", "C"], and so on. Every pair is therefore produced twice, which the count corrects for further down.
    • Unwind those arrays.
    • Group by pair, with each pair sorted so that ["A", "B"] and ["B", "A"] land in the same group.
    • Add 0.5 per occurrence instead of 1, which is the same as dividing the doubled count by 2.
    • Convert the count back to an integer with $toInt.


    aggregate(
    [ {
      "$set" : {
        "tags" : {
          "$cond" : [ {
            "$eq" : [ {
              "$size" : "$tags"
            }, 1 ]
          }, {
            "$concatArrays" : [ "$tags", [ null ] ]
          }, "$tags" ]
        }
      }
    }, {
      "$set" : {
        "a" : {
          "$map" : {
            "input" : "$tags",
            "in" : {
              "member" : "$$t",
              "togetherWith" : {
                "$setDifference" : [ "$tags", [ "$$t" ] ]
              }
            },
            "as" : "t"
          }
        }
      }
    }, {
      "$unwind" : {
        "path" : "$a"
      }
    }, {
      "$replaceRoot" : {
        "newRoot" : "$a"
      }
    }, {
      "$unwind" : {
        "path" : "$togetherWith"
      }
    }, {
      "$group" : {
        "_id" : {
          "$cond" : [ {
            "$lt" : [ "$member", "$togetherWith" ]
          }, [ "$member", "$togetherWith" ], [ "$togetherWith", "$member" ] ]
        },
        "count" : {
          "$sum" : 0.5
        }
      }
    }, {
      "$set" : {
        "count" : {
          "$toInt" : "$count"
        }
      }
    } ]
    )
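
To sanity-check the logic without a MongoDB instance, the stages above can be mirrored in plain JavaScript. This is only a sketch, not part of the original answer: `countTagPairs`, `lessThan`, and `sampleDocs` are hypothetical names introduced here, the sample data is the four documents from the question with their other fields omitted, and the null handling assumes MongoDB's comparison order, where null sorts before strings.

```javascript
// Plain-JavaScript mirror of the pipeline stages; no MongoDB driver involved.

// Ordering helper (hypothetical): null sorts before any string, as in BSON.
const lessThan = (a, b) => (a === null ? b !== null : b !== null && a < b);

function countTagPairs(docs) {
  const counts = new Map();
  for (const doc of docs) {
    // First $set: pad a single-tag array with null so it still forms one pair.
    const tags = doc.tags.length === 1 ? [...doc.tags, null] : doc.tags;
    for (const member of tags) {
      // $map + $setDifference: every other distinct tag in the same document.
      const togetherWith = [...new Set(tags)].filter((t) => t !== member);
      // Both $unwind stages: one row per (member, other) combination.
      for (const other of togetherWith) {
        // $group key: order the pair so ["B","A"] and ["A","B"] coincide.
        const pair = lessThan(member, other) ? [member, other] : [other, member];
        const key = JSON.stringify(pair);
        // Each pair shows up twice per document, so add 0.5 per occurrence.
        counts.set(key, (counts.get(key) || 0) + 0.5);
      }
    }
  }
  // Final $set: convert the accumulated count back to an integer.
  return [...counts].map(([key, count]) => ({
    _id: JSON.parse(key),
    count: Math.trunc(count),
  }));
}

const sampleDocs = [
  { tags: ["A", "B"] },
  { tags: ["A", "B"] },
  { tags: ["J", "K"] },
  { tags: ["A", "B", "C"] },
];

console.log(countTagPairs(sampleDocs));
```

For the sample documents this yields ["A","B"] with count 3 and ["J","K"], ["A","C"], ["B","C"] with count 1 each, which matches the output asked for in the question. A document with a single tag such as ["V"] ends up as the pair [null, "V"] with count 1.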