Search code examples
mongodbmongodb-querykey

mongodb how to query records that have at least N key in giving keys?


How to find records the have at least N same keys with the specific record?

For example the specific record has k1, k2, k3, k4, k5 five keys. How to find records that has at least 3 keys than in the five keys?

For example current data is below and find docs that has at least 3 keys in k1,k2,k3,k4

doc1 {k2:1, k3:1, k4:1, k5:1, k6:1}
doc2 {k1:1, k3:1, k4:1, k5:1, k7:1}
doc3 {k1:1, k2:1, k4:1, k5:1, k8:1}
doc4 {k1:1, k2:1, k3:1, k5:1}
doc5 {k1:1, k2:1}

expected output:

doc1 {k2:1, k3:1, k4:1}
doc2 {k1:1, k3:1, k4:1}
doc3 {k1:1, k2:1, k4:1}
doc4 {k1:1, k2:1, k3:1}

Solution

  • You can do the followings in an aggregation pipeline:

    1. use $objectToArray to convert the $$ROOT document into an array of k-v tuples
    2. use $reduce to count for array entries that the field are in [k1,k2,k3,k4]
    3. keep only the count >= 3(or any other threshold you specified)
    4. $project to your desired form
    db.collection.aggregate([
      {
        "$addFields": {
          "validCount": {
            "$objectToArray": "$$ROOT"
          }
        }
      },
      {
        "$addFields": {
          "validCount": {
            "$reduce": {
              "input": "$validCount",
              "initialValue": 0,
              "in": {
                "$cond": {
                  "if": {
                    "$in": [
                      "$$this.k",
                      [
                        "k1",
                        "k2",
                        "k3",
                        "k4"
                      ]
                    ]
                  },
                  "then": {
                    "$add": [
                      "$$value",
                      1
                    ]
                  },
                  "else": "$$value"
                }
              }
            }
          }
        }
      },
      {
        "$match": {
          validCount: {
            $gte: 3
          }
        }
      },
      {
        "$project": {
          _id: 1,
          k1: 1,
          k2: 1,
          k3: 1,
          k4: 1
        }
      }
    ])
    

    Here is a Mongo playground for your reference.