Search code examples
mongodbmongodb-queryaggregation-frameworkmongo-java-driver

Finding perfect subsets


We have requests collection :

{
    "_id" : ObjectId("xxxxxx"),
    "requestId" : "REQ4",
    "scrips" : "[3553, 5647]"
}

{
    "_id" : ObjectId("xxxxxx"),
    "requestId" : "REQ1",
    "scrips" : "[0001]"
}

and Scrips collection:

{
    "_id" : ObjectId("xxxx"),
    "scrip" : "0001"
}
{
    "_id" : ObjectId("xxxx"),
    "scrip" : "0456"
}

We need to get a list of all requests in requests collection - who have scrips as a complete subset of the scrip collections.

Expected output : ['REQ1']

How do we achieve this in the most efficient way?
We are using java mongo driver 3.7


Solution

  • First of all, try saving scrips in request collection as an array of string instead of a string of array-like ["3553"] making it:

    {
        "_id" : ObjectId("xxxxxx"),
        "requestId" : "REQ4",
        "scrips" : ["3553", "5647"]
    }
    

    After that, you can use below to maintain an array of all scrips from scrip collection:

    let allScrips=await scrips.aggregate([
      {$group:{
       _id:null,scrips:{$addToSet:"$scrip"}
      }}
    ])
    

    After that use $setIsSubset in the aggregate pipeline to compare above allScrips[0].scrips array with request collection's scrips only if allScrips.length comes

    https://docs.mongodb.com/manual/reference/operator/aggregation/setIsSubset/

    let requestsWithScrips=await requests.aggregate([
      { $project: {requestId:1,isSubset: { $setIsSubset: [ "$scrips",allScrips[0].scrips] }}},
      { $match:{isSubset:true}}
    ])