Search code examples
regexmongodbaggregation-frameworknosql-aggregation

MongoDB aggregation - $regexMatch on array field


I'm using Mongo's $regexMatch operator to find documents where at least part of a field matches the pattern, and this works fine for root-level fields. But how do I use it with array fields? I want to return a match if at least one of the array elements matches the pattern.

For example, let's say the collection has these two documents:

{
  "_id": ObjectId("5ff6335c1570ba63ca5ac21e"),
  "requirements": [
    {
      "description": "Bachelor of Science required for this blah blah blah",
      "code": "ABC"
    },
    {
      "description": "Also much experience in JavaScript blah",
      "code": "XYZ"
    }
  ]
},

{
  "_id": ObjectId("5ff6335b1570ba63ca5abefb"),
  "requirements": [
    {
      "description": "Master of Arts WANTED NOW!",
      "code": "TTT"
    },
    {
      "description": "5+ experience required in C++",
      "code": "QQQ"
    }
  ]
}

and something like this pipeline

db.Collection.aggregate([
  { $match:
     { $expr:
        { $regexMatch: { 
          input: '$requirements.description', 
          regex: /^.*?\bblah blah blah\b.*?$/im 
        } } 
     } 
  }
])

should return just the first document since its first element inside requirements matches on description containing "blah blah blah" ("Bachelor of Science required for this blah blah blah").

However this just throws me an error saying '$regexMatch needs input to be of type string'. And replacing it with $requirements[0].description doesn't work either.

So is there a way to regex match array fields in Mongo?


Solution

  • $regexMatch allows only string input requirements having array It requires to iterate loop array values,

    • $reduce to iterate loop of description, check condition if expression match then return score otherwise return initial value
    db.collection.aggregate([
      {
        $addFields: {
          score: {
            $reduce: {
              input: "$requirements.description",
              initialValue: 0,
              in: {
                $cond: [
                  {
                    $eq: [
                      {
                        $regexMatch: {
                          input: "$$this",
                          regex: "blah blah blah"
                        }
                      },
                      true
                    ]
                  },
                  50,
                  "$$value"
                ]
              }
            }
          }
        }
      }
    ])
    

    Playground


    If you want filter documents the simply try $regex in $match stage,

    db.collection.aggregate([
      {
        $match: {
          "requirements.description": {
            $regex: "blah blah blah"
          }
        }
      }
    ])
    

    Playground