javascript mongodb mongodb-query aggregation-framework aggregate

How to $bucket only unique documents based on some field

I am using the MongoDB aggregation framework. Suppose I have a collection structured like this:

[
  {
    _id: ObjectId(123),
    name: "john",
    sessionDuration: 29
  },
  {
    _id: ObjectId(456),
    name: "moore",
    sessionDuration: 45
  },
  {
    _id: ObjectId(789),
    name: "john",
    sessionDuration: 25
  },
  {
    _id: ObjectId(910),
    name: "john",
    sessionDuration: 45
  },
  etc...
]

Documents with the same name belong to one user who has several sessions. In this example, John uses the service from three devices, so he has three session durations: two below 30 (29 and 25) and one below 50 (45).

I want to run a $bucket query with boundaries [0, 30, 50], but each range must count only unique names: a name that appears more than once within the same range should be counted once. The result should look like this:

{
  time: Unique_Name_Users_Only_Lies_In_This_Boundary,
  '30': 1,
  '50': 2,
}

John has two sessions with a duration below 30, so only one of those two should be counted.

What I tried: I first grouped all the documents by unique name and then applied $bucket. But this approach also drops John's session with sessionDuration 45.

How can I count only the documents with unique names within each $bucket boundary range?

Solution

One option is to use $bucket with $addToSet, and then $group with $arrayToObject to get your formatting:

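db.collection.aggregate([
  // Bucket by duration; $addToSet keeps each name only once per bucket
  {$bucket: {
      groupBy: "$sessionDuration",
      boundaries: [0, 30, 50],
      default: "Other",  // durations outside [0, 50) land here
      output: {res: {$addToSet: "$name"}}
  }},
  // Collect one {k, v} pair per bucket: k = bucket _id as a string, v = number of unique names
  {$group: {
      _id: 0,
      res: {$push: {k: {$toString: "$_id"}, v: {$size: "$res"}}}
  }},
  // Turn the array of pairs into a single document
  {$replaceRoot: {newRoot: {$arrayToObject: "$res"}}}
])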

See how it works on the playground example
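
If no playground is at hand, the same counting logic can be checked in plain JavaScript. This is only an in-memory sketch of what the pipeline does, not a MongoDB API: the function name countUniqueNamesPerBucket and the sampleDocs array are made up for illustration, and a Set stands in for $addToSet.

```javascript
// In-memory sketch of the pipeline's logic (hypothetical helper, not part of MongoDB).
function countUniqueNamesPerBucket(docs, boundaries, defaultLabel = "Other") {
  const namesByBucket = new Map(); // bucket label -> Set of names seen in that bucket
  for (const doc of docs) {
    // Find the bucket whose range [lower, upper) contains the duration.
    let label = defaultLabel;
    for (let i = 0; i < boundaries.length - 1; i++) {
      if (doc.sessionDuration >= boundaries[i] && doc.sessionDuration < boundaries[i + 1]) {
        label = String(boundaries[i]); // like $bucket, label by the lower boundary
        break;
      }
    }
    if (!namesByBucket.has(label)) namesByBucket.set(label, new Set());
    namesByBucket.get(label).add(doc.name); // the Set plays the role of $addToSet
  }
  // Equivalent of $size + $arrayToObject: one key per bucket, value = unique-name count.
  const result = {};
  for (const [label, names] of namesByBucket) result[label] = names.size;
  return result;
}

// The four sample documents from the question (without _id).
const sampleDocs = [
  { name: "john", sessionDuration: 29 },
  { name: "moore", sessionDuration: 45 },
  { name: "john", sessionDuration: 25 },
  { name: "john", sessionDuration: 45 },
];

console.log(countUniqueNamesPerBucket(sampleDocs, [0, 30, 50]));
// { '0': 1, '30': 2 }
```

John's two sessions below 30 collapse into one entry, while his session of 45 is still counted in the second bucket alongside moore.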

Notice that the _id of a bucket is its lower boundary, so for the four sample documents above the keys are "0" and "30" rather than '30' and '50'. You can manipulate this if you really want, but I don't recommend it.
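
If the upper-boundary keys from the question ('30' and '50') are still wanted, one possible place to rename them is application code after the aggregation has returned, which leaves the pipeline untouched. The helper below is a hypothetical sketch, not a MongoDB feature; it assumes the result document is keyed by lower boundary and that the same boundaries array is available to the caller.

```javascript
// Hypothetical post-processing helper: rename lower-boundary keys to upper-boundary keys.
function relabelByUpperBound(result, boundaries) {
  const relabeled = {};
  for (const [key, count] of Object.entries(result)) {
    const i = boundaries.findIndex((b) => String(b) === key);
    // Keys that are not a lower boundary (e.g. the "Other" default bucket) stay unchanged.
    const hasUpper = i !== -1 && i < boundaries.length - 1;
    relabeled[hasUpper ? String(boundaries[i + 1]) : key] = count;
  }
  return relabeled;
}

console.log(relabelByUpperBound({ "0": 1, "30": 2 }, [0, 30, 50]));
// { '30': 1, '50': 2 }
```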