I have documents that look like this
{
"field1" : "value",
"field3" : "value",
"attributes" : {
"key1" : {
"category" : "4",
"value" : "value"
},
"key2" : {
"category" : "5",
"value" : "value"
},
}
}
I want to transform these to look like the structure below (basically it's taking a dictionary and flattening it into an array). Note the number of keys in attributes
varies.
{
"field1" : "value",
"field3" : "value",
"attributes" : [
{
"key" : "key1",
"category" : "4",
"value" : "value"
},
{
"key" : "key2",
"category" : "5",
"value" : "value"
},
]
}
Because the number of documents is significant, I would like to use the bulk APIs. I have this basic structure to my script.
db = db.getSiblingDB('MyDB');
db.MyCollection.intializeOrderedBukOp(),
var bulk = db.MyCollection.intializeOrderedBukOp(),
count = 0;
db.MyCollection.find({}).forEach(function(doc) {
// Re-map attrs (what follows is some pseudocode
var attrs = function TransformSomehow(doc.attributes) {
// return an array...
};
});
// Queue update operation
bulk.find({ "_id": doc._id })
.updateOne({
"$set": { "attrs": attrs } }
});
count++;
// Execute every 1000
if ( count % 1000 == 0 ) {
bulk.execute();
bulk = db.collection.intializeOrderedBukOp();
}
});
// Drain any queued remaining
if ( count % 1000 != 0 )
bulk.execute();
This was borrowed from this answer, but I can't use it as is because it's going the opposite direction (array to object). It appears that I can't use map
to do this.
This is one way to do bulk write using the MongoDB NodeJS driver v3.6 and MongoDB v4.2.3. This uses a simple batch to update all the collection documents. The update transforms the object (or dictionary) into an array (as per your requirement) for each document.
const collection = db.collection('test');
const update1 = {
updateMany: {
filter: { },
update: [
{ $set: {
attributes: {
$map: {
input: { $objectToArray: "$attributes" },
in: { $mergeObjects: [ { "key": "$$this.k" }, "$$this.v" ] }
}
}
} }
]
}
};
collection.bulkWrite( [ update1 ],
{ ordered: false },
( err, result ) => {
console.log( 'Updated and modified count: ', result.matchedCount, result.modifiedCount );
}
)
Reference: CRUD Operations - Bulk Write