I am facing a problem using word boundary regex with mongolite
. It looks like the word boundary \b
does not work, whereas it works in norm MongoDB queries.
Here is a working example:
I create this toy collection:
db.test2.insertMany([
{ item: "journal gouttiere"},
{ item: "notebook goutte"},
{ item: "paper plouf"},
{ item: "planner gouttement"},
{ item: "postcard goutte"}
]);
With mongosh
:
db.test2.aggregate(
{
$match: {
item: RegExp("\\bgoutte\\b")
}
})
returns:
[
{
"_id": {
"$oid": "63206efeb0e1e89db6ef0c20"
},
"item": "notebook goutte"
},
{
"_id": {
"$oid": "63206efeb0e1e89db6ef0c23"
},
"item": "postcard goutte"
}
]
But:
library(mongolite)
connection <- mongo(collection="test2",db="test",
url = "mongodb://localhost:27017",
verbose = T)
connection$aggregate(pipeline = '[{
"$match": {
"item":{"$regex" : "\\bgoutte\\b", "$options" : "i"}
}
}]',options = '{"allowDiskUse":true}')
returns 0 lines. Changing to
connection$aggregate(pipeline = '[{
"$match": {
"item":{"$regex" : "goutte", "$options" : "i"}
}
}]',options = '{"allowDiskUse":true}')
Imported 3 records. Simplifying into dataframe...
_id item
1 63206efeb0e1e89db6ef0c20 notebook goutte
2 63206efeb0e1e89db6ef0c22 planner gouttement
3 63206efeb0e1e89db6ef0c23 postcard goutte
It looks like the word boundary regex does not work the same with mongolite. What is the proper solution ?
Ottie is right (and should post an answer!–I'd be fine with deleting mine then):
Backslashes have special meaning for both R and in the regex. You need two additional backslashes (one per \
) to pass \\
from R to mongoDB (where you escape \b
by \\b
), see e.g. this SO question. I just checked:
con <- mongo(
"test",
url = "mongodb+srv://readwrite:[email protected]/test"
)
con$insert('{"item": "notebook goutte" }')
con$insert('{"item": "postcard goutte" }')
Now
con$aggregate(pipeline = '[{
"$match": {
"item":{"$regex" : "\\\\bgoutte\\\\b", "$options" : "i"}
}
}]',options = '{"allowDiskUse":true}')
yields
_id item
1 63234ac1435f9b7c2a0787c2 notebook goutte
2 63234ac5435f9b7c2a0787c5 postcard goutte