I need to update values in an array in many MongoDB documents using a bulk write operation (the new array differs for each document). My question is: is there a way to update a subset of array values based on the element position range, passing an array to replace those values with?
Here is my document structure:
{
year: 2020,
location_id: 1,
values: [1.2 0 0 5.2 1.02 8 0 0 0 0 1.2 4]
}
Let's say I want to replace all values except the first one with this new array:
[1 2 5.1 2 4 0 87 1 0.2 2 9]
i.e., 'values' should become:
[1.2 1 2 5.1 2 4 0 87 1 0.2 2 9]
I know how to replace the array fully:
UpdateOne({'year': 2020, 'location_id': 1}, {'$set': {'values': [1.2 1 2 5.1 2 4 0 87 1 0.2 2 9]}})
But this first option requires me to query the collection first to get the first value of the array. I'd like to avoid that.
I also know how to replace values one by one with 11 separate updateone commands (here in python):
col.bulk_write([UpdateOne({'year': 2020, 'location_id': 1}, {'$set': {'values.1': 1}}),
UpdateOne({'year': 2020, 'location_id': 1}, {'$set': {'values.2': 2}}),
.... ,
UpdateOne({'year': 2020, 'location_id': 1}, {'$set': {'values.11': 9}})],
ordered = False)
This second option means I'm going to send millions of UpdateOne statements daily because I have several million documents that get updated every day and the array to be updated is much larger than in this simple example. I don't like that very much either.
Note that a related question is: Could this second option overload my server or is it not taking more resources than the first option anyway? The first option would mean a bulkwrite with, say 300'000 updateOne statements, each modifying a 365 long array, while the 2nd option would mean 300'000*364 updateOne statements, each modifying a single array element.
I would like to be able to do something like:
UpdateOne({'year': 2020, 'location_id': 1}, {'$set': {'values.1-end': [1 2 5.01 2 4 0 87 1 0.2 2 9]}})
i.e., specify a range of array positions to be replaced by a provided array. So far, I couldn't find how to do that. This would lead to a bulkwrite with 300'000 updateOne statements, each replace exactly those 364 values that need to be replaced.
NOTE: In this example it's replacing "all except the first value", but it may also be "all except the last value". Possibly also "replace elements at positions 4 to 9 by this array of length 6". I'm writing my code in python.
You can use an aggregation pipeline and use this for update:
db.collection.updateOne(
{'year': 2020, 'location_id': 1},
[{
$set: {
values: {
$concatArrays: [
[{ $arrayElemAt: ["$values", 0] }],
[1, 2, 5.1, 2, 4, 0, 87, 1, 0.2, 2, 9]
]
}
}
}]
)
For operations like "replace elements at positions 4 to 9 by this array of length 6" you may also use $slice
Also operator $range may help, see How to modify value in array by position at aggregation framework