Search code examples
mongodbmongodb-queryaggregation-frameworknosql-aggregation

Mongo - Group documents with different regexes


I have a collection called vacancy with those fields:

[{
  "name": "Software Developer",
  "published_at": "2022-08-31"
},
{
  "name": "Tech Lead",
  "published_at": "2022-08-31"
},
{
  "name": "Team Lead",
  "published_at": "2022-08-31"
},
{
  "name": "Software Engineer",
  "published_at": "2022-08-31"
}]

What would be the aggregation query that would returns to me two groups based on two REGEXes which will verify a word existence in the field name.

Something like that:

[{
  "_id": "Software",
  "count": 2
},
{
  "_id": "Lead",
  "count": 2
}]


Solution

  • Query1

    • group by null (all collection 1 group)
    • create two custom groups
    • to the first sum 1 if Software else 0
    • to the second sum 1 if Lead else 0

    Playmongo

    aggregate(
    [{"$group": 
       {"_id": null,
        "Software": 
         {"$sum": 
           {"$cond": 
             [{"$regexMatch": {"input": "$name", "regex": "Software"}}, 1, 0]}},
        "Lead": 
         {"$sum": 
           {"$cond": 
             [{"$regexMatch": {"input": "$name", "regex": "Lead"}}, 1, 0]}}}}])
    

    If you want the exact same output like in your example try this, but maybe the first query is ok also.

    Query2

    • like the above, but root becomes array, that is unwinded
    • and finaly replaces the root to get the 2 documents like in your expected output

    Playmongo (you can put the mouse in the end of each stage to see what it does)

    aggregate(
    [{"$group": 
       {"_id": null,
        "Software": 
         {"$sum": 
           {"$cond": 
             [{"$regexMatch": {"input": "$name", "regex": "Software"}}, 1, 0]}},
        "Lead": 
         {"$sum": 
           {"$cond": 
             [{"$regexMatch": {"input": "$name", "regex": "Lead"}}, 1, 0]}}}},
     {"$unset": ["_id"]},
     {"$project": {"array": {"$objectToArray": "$$ROOT"}}},
     {"$unwind": "$array"}, {"$replaceRoot": {"newRoot": "$array"}},
     {"$project": {"_id": "$k", "count": "$v"}}])