Search code examples
node.jsmongodbnode-mongodb-native

Distinct values from various fields in MongoDB collection


I am using node-mongodb-native to fire mongodb queries using node js. There is a collection name 'locations', which have following fields:

sublocality1, sublocality2, sublocality3, city.

I want to fetch overall distinct values from these fields.

Eg: Documents:

{
  'sublocality1':'a',
  'sublocality2':'a',
  'sublocality3': 'b',
  'city': 'c'
}

{
  'sublocality1':'b',
  'sublocality2':'a',
  'sublocality3': 'b',
  'city': 'a'
}

The query should return

['a' , 'b', 'c']

I tried following:

Run distinct queries for each of the fields:
collection.distinct('sublocality1',..){},
collection.distinct('sublocality2',..){},
collection.distinct('sublocality3',..){},
collection.distinct('city',..){}

Insert the result from these queries into a list, and search for distinct items across list.

Can I optimize this? Is it possible running a single query?


Solution

  • You could aggregate it on the database server as below:

    • Group Individual document, to get the values of each intended field in an array.
    • Project a field named values as the union of all the intended field values, using the $setUnion operator.
    • Unwind values.
    • Group all the records, to get the distinct values.

    Code:

    Collection.aggregate([
    {$group:{"_id":"$_id",
             "sublocality1":{$push:"$sublocality1"},
             "sublocality2":{$push:"$sublocality2"},
             "sublocality3":{$push:"$sublocality3"},
             "city":{$push:"$city"}}},
    {$project:{"values":{$setUnion:["$sublocality1",
                                    "$sublocality2",
                                    "$sublocality3",
                                    "$city"]}}},
    {$unwind:"$values"},
    {$group:{"_id":null,"distinct":{$addToSet:"$values"}}},
    {$project:{"distinct":1,"_id":0}}
    ],function(err,resp){
       // handle response
    })
    

    Sample o/p:

    { "distinct" : [ "c", "a", "b" ] }
    

    If you want the results to be sorted, you could apply a sort stage in the pipeline before the final project stage.