word boundary in regex with mongolite

I am facing a problem using a word-boundary regex with mongolite. It looks like the word boundary \b does not work, whereas it works in normal MongoDB queries.

Here is a working example:

I create this toy collection:

   { item: "journal gouttiere"},
   { item: "notebook goutte"},
   { item: "paper plouf"},
   { item: "planner gouttement"},
   { item: "postcard goutte"}

With mongosh:

$match: {
    item: RegExp("\\bgoutte\\b")
}

returns:

{
    "_id": {
      "$oid": "63206efeb0e1e89db6ef0c20"
    },
    "item": "notebook goutte"
},
{
    "_id": {
      "$oid": "63206efeb0e1e89db6ef0c23"
    },
    "item": "postcard goutte"
}

With mongolite:

library(mongolite)

connection <- mongo(collection = "test2", db = "test",
                    url = "mongodb://localhost:27017",
                    verbose = TRUE)

connection$aggregate(pipeline = '[{
      "$match": {
            "item": {"$regex" : "\\bgoutte\\b", "$options" : "i"}
      }
}]', options = '{"allowDiskUse":true}')

This returns 0 rows. Dropping the word boundaries:

connection$aggregate(pipeline = '[{
      "$match": {
            "item": {"$regex" : "goutte", "$options" : "i"}
      }
}]', options = '{"allowDiskUse":true}')

returns:

 Imported 3 records. Simplifying into dataframe...
                       _id               item
1 63206efeb0e1e89db6ef0c20    notebook goutte
2 63206efeb0e1e89db6ef0c22 planner gouttement
3 63206efeb0e1e89db6ef0c23    postcard goutte

It looks like the word-boundary regex does not behave the same way with mongolite. What is the proper solution?


  • Ottie is right (and should post an answer; I'd be fine with deleting mine then):

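    Backslashes have a special meaning both in R strings and in the regex. You need two additional backslashes (one per \) so that R passes \\ on to MongoDB (where \b is escaped as \\b); see e.g. this SO question. With only \\b in the R source, the JSON pipeline contains \b, which JSON reads as a backspace character rather than a word boundary. I just checked: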

    con <- mongo(
     url = "mongodb+srv://readwrite:[email protected]/test"
    )
    con$insert('{"item": "notebook goutte" }')
    con$insert('{"item": "postcard goutte" }')


    con$aggregate(pipeline = '[{
          "$match": {
                "item": {"$regex" : "\\\\bgoutte\\\\b", "$options" : "i"}
          }
    }]', options = '{"allowDiskUse":true}')


                           _id            item
    1 63234ac1435f9b7c2a0787c2 notebook goutte
    2 63234ac5435f9b7c2a0787c5 postcard goutte
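
The escaping layers can also be inspected without a database. The following is only a sketch: it assumes the jsonlite package (which mongolite depends on) parses the string the way the pipeline JSON is parsed, and it uses R's own regex engine as a stand-in for MongoDB's. The variable names are made up for the illustration.

```r
library(jsonlite)

# Same "$regex" fragment, written with two and with four backslashes in R source.
two  <- '{"$regex": "\\bgoutte\\b"}'
four <- '{"$regex": "\\\\bgoutte\\\\b"}'

# What the JSON parser actually receives:
writeLines(two)   # {"$regex": "\bgoutte\b"}
writeLines(four)  # {"$regex": "\\bgoutte\\b"}

# What the regex engine receives after JSON parsing:
p2 <- fromJSON(two)[["$regex"]]
p4 <- fromJSON(four)[["$regex"]]

nchar(p2)  # 8: backspace, "goutte", backspace
nchar(p4)  # 10: backslash-b, "goutte", backslash-b

# Stand-in check with R's PCRE engine (assumption: not MongoDB itself).
docs <- c("notebook goutte", "planner gouttement")
grepl(p4, docs, perl = TRUE)  # TRUE FALSE

# In R >= 4.0 a raw string avoids the doubled escaping in R source.
raw <- r"({"$regex": "\\bgoutte\\b"})"
identical(raw, four)  # TRUE
```

With a raw string, the R source shows exactly the text that reaches the JSON parser, so only the JSON-level escaping (\\b) has to be written.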