
How can I count items in a MongoDB collection that have intersecting subcollections?


For every item in my collection, I need to find the count of other items whose subcollections intersect with it. For example, given this collection:

[{id:1,"sub":[1, 2, 3]},
{id:2,"sub":[2, 3, 4]},
{id:3,"sub":[4, 5, 6]},
{id:4,"sub":[7, 8, 9]}]

the expected result is:

[{id:1,"count":1},
{id:2,"count":2},
{id:3,"count":1},
{id:4,"count":0}]
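
To make the expected result concrete, here is a brute-force sketch in plain JavaScript with no MongoDB involved. The helper name `countIntersecting` is made up for illustration; it simply compares every item against every other item, so treat it as a reference for the desired output rather than a solution.

```javascript
// Hypothetical helper (not part of any MongoDB API): brute-force reference
// for the expected result.
function countIntersecting(docs) {
  return docs.map((doc) => {
    const own = new Set(doc.sub);
    // Count every *other* document that shares at least one value with this one.
    const count = docs.filter(
      (other) => other.id !== doc.id && other.sub.some((v) => own.has(v))
    ).length;
    return { id: doc.id, count };
  });
}

const docs = [
  { id: 1, sub: [1, 2, 3] },
  { id: 2, sub: [2, 3, 4] },
  { id: 3, sub: [4, 5, 6] },
  { id: 4, sub: [7, 8, 9] },
];

console.log(countIntersecting(docs));
// [ { id: 1, count: 1 }, { id: 2, count: 2 }, { id: 3, count: 1 }, { id: 4, count: 0 } ]
```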

Solution

  • Starting with the algorithm in pure MongoDB query language: you have to restructure your documents so that each document contains its own sub array plus an array of all the sub arrays in the collection. To do that, run $group followed by $unwind. From there, use $map with $setIntersection to intersect every array with the current one, $filter out the results that are empty or equal to the document's own array, and count what is left with $size.

    db.collection.aggregate([
        {
            $group: {
                _id: null,
                current: { $push: "$$ROOT" },
                all: { $push: "$sub" }
            }
        },
        {
            $unwind: "$current"
        },
        {
            $project: {
                id: "$current.id",
                count: {
                    $size: {
                        $filter: {
                            input: {
                                $map: {
                                    input: "$all",
                                    in: { $setIntersection: [ "$$this", "$current.sub" ] }
                                }
                            },
                            cond: {
                                $and: [ 
                                    { $ne: [ "$$this", [] ] },
                                    { $ne: [ "$$this", "$current.sub" ]}
                                ]
                            }
                        }
                    }
                }
            }
        }
    ])
    

    Mongo Playground
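
If you want to sanity-check the logic of the pipeline above without a server, the stages can be imitated in plain JavaScript. This is only a rough sketch: `intersection`, `sameSet` and `pipelineCount` are made-up helper names, and arrays are compared as sets here, whereas the real `$ne` compares arrays element by element (and the MongoDB docs leave the output order of `$setIntersection` unspecified).

```javascript
// In-memory imitation of the $group / $unwind / $project stages.
// Plain JavaScript, no MongoDB; all helper names are made up for this sketch.
const intersection = (a, b) => [...new Set(a)].filter((v) => b.includes(v));
const sameSet = (a, b) =>
  new Set(a).size === new Set(b).size && a.every((v) => b.includes(v));

function pipelineCount(docs) {
  // $group with _id: null: gather every "sub" array into one list.
  const all = docs.map((d) => d.sub);
  // $unwind + $project: one output document per original document.
  return docs.map((current) => {
    const hits = all
      .map((sub) => intersection(sub, current.sub)) // $map + $setIntersection
      .filter((x) => x.length > 0 && !sameSet(x, current.sub)); // $filter cond
    return { id: current.id, count: hits.length }; // $size
  });
}

const sample = [
  { id: 1, sub: [1, 2, 3] },
  { id: 2, sub: [2, 3, 4] },
  { id: 3, sub: [4, 5, 6] },
  { id: 4, sub: [7, 8, 9] },
];
console.log(pipelineCount(sample));
// [ { id: 1, count: 1 }, { id: 2, count: 2 }, { id: 3, count: 1 }, { id: 4, count: 0 } ]

// Caveat: the "not equal to my own sub" test also drops any other document
// whose sub array fully contains the current one.
const nested = [
  { id: 1, sub: [1, 2] },
  { id: 2, sub: [1, 2, 3] },
];
console.log(pipelineCount(nested));
// [ { id: 1, count: 0 }, { id: 2, count: 1 } ]  (id 1 arguably should be 1)
```

For the sample data this matches the expected result. The second run shows where the self-exclusion by array comparison can undercount; if your data can contain such supersets or duplicate sub arrays, excluding the current document by its id instead would probably be safer.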

    Since the aggregation is quite complex, there's little point in writing it in a strongly typed way in C#. Instead, you can build the pipeline stages with the BsonDocument class, like this:

    var groupDef = new BsonDocument()
            {
                { "_id", BsonNull.Value }, // same constant key as _id: null in the shell pipeline
                { "current", new BsonDocument(){ { "$push", "$$ROOT" } } },
                { "all", new BsonDocument(){ { "$push", "$sub" } } },
            };
    
    var projectDef = BsonDocument.Parse(@"{
            id: ""$current.id"",
            _id: 0,
            count: {
                $size: {
                    $filter: {
                        input: {
                            $map: {
                                input: ""$all"",
                                in: { $setIntersection: [ ""$$this"", ""$current.sub"" ] }
                            }
                        },
                        cond: {
                            $and: [
                                { $ne: [ ""$$this"", [] ] },
                                { $ne: [ ""$$this"", ""$current.sub"" ] }
                            ]
                        }
                    }
                }
            }
        }");
    
    var result = mongoDBCollection.Aggregate()
                                    .Group(groupDef)
                                    .Unwind("current")
                                    .Project(projectDef)
                                    .ToList();