I have a collection in MongoDB.(Technically it is the cosmosDB API for MongoDB but in theory that shouldn’t matter.) The collection contains nested arrays of strings. These strings contain brackets inside them, and I need to remove them from the strings inside the arrays. Sample (fake) data:
{
“Thing_id”: “100”,
“Thing_name”: “Thing100”
“Comments”: [
“Good: [Thing100 is awesome]”,
“Bad: [I will never buy Thing100 again.]”
]
},
{
“Thing_id”: “101”,
“Thing_name”: “Thing101”
“Comments”: [
“Comparative: [Thing101 is so much better than Thing100.]”,
“Bad: [Who designed such piece of …]”
]
}
]
Any above syntax errors are because I am typing it on my phone…my apologies if it is not well-formed json/bson. To recap, I want to remove just the brackets in each value in the comments arrays. I am using pymongo to connect to the DB, and also python 3.7.4.
I have searched for how to do this, but all I can find are how to update the whole value in the array instead of replacing a character in the string inside the array.
FindAndModify() looked promising until I found out it will only do the first document found.
Is the only option to query the collection and loop through each document?
I don't have access to Azure Cosmos DB, so I can't test. If you are using pymongo
and want to update all documents in the collection, you probably want to use update_many
.
Here are a couple parameters for update_many
that may work for you.
filter = {'$expr': {'$gt': [{'$size': '$Comments'}, 0]}}
update = [
{
"$set": {
# rewrite Comments
"Comments": {
"$map": {
"input": "$Comments",
"as": "comment",
"in": {
"$reduce": {
"input": {
# get all matches without "[" or "]"
# every comment needs to be a string
"$regexFindAll": {
"input": "$$comment",
# backslashes "\" are used to "escape" brackets
# the number of "\" required may depend on platform, etc.
"regex": "[^\\[\\]]*"
}
},
"initialValue": "",
"in": {
# concat all matches
"$concat": ["$$value", "$$this.match"]
}
}
}
}
}
}
}
]
You can try this (slightly modified for the language/platform) on mongoplayground.net.