Search code examples
mongodbcontinuous

Finding ranges of continuous values


I have the following Mongo collection:

[
  {
    "key": 1,
    "user": "A",
    "comment": "commentA1"
  },
  {
    "key": 2,
    "user": "A",
    "comment": "commentA2"
  },
  {
    "key": 5,
    "user": "A",
    "comment": "commentA5"
  },
  {
    "key": 2,
    "user": "B",
    "comment": "commentB2"
  },
  {
    "key": 3,
    "user": "B",
    "comment": "commentB3"
  },
  {
    "key": 6,
    "user": "B",
    "comment": "commentB6"
  }
]

and I need to find the first continuous keys, with no gaps, per user. So, for user A I should get the first 2 documents, and for user B the first two also. The collection might contain more than 2M documents, so the query should work fast.

I have found SQL solutions for this problem (http://www.silota.com/docs/recipes/sql-gap-analysis-missing-values-sequence.html in section number 3), but I am looking for a Mongo solution.

How can I do it in Mongo 4.0 (DocumentDB) ?


Solution

  • EDIT: according to further elaboration on the comments,

    One option is:

    db.collection.aggregate([
      {$sort: {key: 1}},
      {$group: {
          _id: "$user",
          data: {$push: {key: "$key", comment: "$comment"}},
          shadow: {$push: {$add: ["$key", 1]}}
      }},
      {$project: {
          data: 1,
          shadow: {$filter: {input: "$shadow", cond: {$in: ["$$this", "$data.key"]}}}
      }},
      {$project: {data: 1, shadow: 1, firstItem: {$subtract: [{$first: "$shadow"}, 1]}}},
      {$project: {data: 1, firstItem: 1, shadow: {$concatArrays: [["$firstItem"], "$shadow"]}}},
      {$project: {
          data: 1,
          shadow: {$reduce: {
              input: {$range: [0, {$size: "$shadow"}]},
              initialValue: [],
              in: {
                $concatArrays: [
                  "$$value",
                  {$cond: [
                      {$eq: [
                          {$arrayElemAt: ["$shadow", "$$this"]},
                          {$add: ["$$this", "$firstItem"]}
                      ]},
                      [{$arrayElemAt: ["$shadow", "$$this"]}],
                      []
                  ]},
                ]
              }
            }
          }
        }
      },
      {$project: {data: {$filter: {input: "$data", cond: {$in: ["$$this.key", "$shadow"]}}}}},
      {$unwind: "$data"},
      {$project: {comment: "$data.comment", key: "$data.key"}}
    ])
    

    See how it works on the playground example