Search code examples
mongodbmongoosenosqlaggregation-framework

How to use lookup on two collection having object property of variable size


I have two collections one is company and other is reports so below is my company collection.

company collection

[
 {
  company_id:1,
  hubId:4
  dimensions:{ region:'North america',country:'USA'},
  name:'Amsol Inc.'
 },
 {
   company_id:1,
   hubId:4
   dimensions:{ region:'North america',country:'Canada'},
   name:'Amsol Inc.'
 },
 {
   company_id:2,
   hubId:7
   dimensions:{ region:'North america',country:'USA',revenue:34555},
   name:'Microsoft Inc.'
 }
]

reports collection

[
 {
  report_id:1,
  name:'example report',
  hubId:4
  dimensions:{ region:'North america',country:'USA'},
  name:'Amsol Inc.'
 },
 {
   report_id:2,
   name:'example report',
   hubId:4
   dimensions:{ region:'North america',country:'Canada'},
   name:'Amsol Inc.'
 },
 {
   report_id:3,
   name:'example report',
   hubId:5
   dimensions:{ region:'North america',country:'USA',revenue:20000},
   name:'Microsoft Inc.'
 }
 {
   report_id:4,
   name:'example report',
   hubId:4
   dimensions:{region:'North america',country:'Greenland'},
   name:'Amsol Inc.'
 },
]

OUTPUT

 [
  {
  report_id:1,
  name:'example report',
  hubId:4
  dimensions:{ region:'North america',country:'USA'},
  name:'Amsol Inc.'
 },
 {
   report_id:2,
   name:'example report',
   hubId:4,
   dimensions:{region:'North america',country:'Canada'},
   name:'Amsol Inc.'
 }
]

I want to fetch all the report that has same HubId and dimensions as of companies.

For Ex: hubId = 4 has 2 companies in company collection but they have different dimensions So here I want to search for all the reports having hubId = 4 and dimensions like matches with any of these companies.

If you noticed, the collections have thousands of records like this.

I have been looking for some aggregation pipeline but fail to understand how can I apply the logic here to get the result here.


Solution

  • If I understand you correctly, a $lookup with pipeline can do the job: The let defines the arguments from the reports collection, and they are used with $$ inside the $lookup pipeline. The $ arguments are for the context of the $lookup pipeline, meaning the company collection.

    The $setEquals is used to test equality of arrays where the order of items is not important. Here it is used with $objectToArray to test equality of objects (which are being changed to arrays and then compared).

    db.reports.aggregate([
      {$lookup: {
          from: "company",
          let: {hubId: "$hubId", dimensions: "$dimensions"},      
          as: "companies",
          pipeline: [
            {$match: {
                $expr: {
                  $and: [
                    {$eq: ["$hubId", "$$hubId"]},
                    {$setEquals: [
                        {$objectToArray: "$dimensions"},
                        {$objectToArray: "$$dimensions"}
                    ]}
                  ]
                }
            }},
            {$project: {_id: 1}}
          ]
      }},
      {$match: {"companies.0": {$exists: true}}},
      {$unset: "companies"}
    ])
    

    See how it works on the playground example