Search code examples
node.jsmongodbmongo-shell

mongodb foreach doc find min of subset and update it


Product and variants price

Find Minimum of all products.Variants.Price where size is small and update it by 15%

    {
        "_id" : 23,
        "name" : "Polo Shirt",
        "Variants" : [ 
            {
                "size" : "Large",
                "Price" : 82.42
            }, 
            {
                "size" : "Medium",
                "Price" : 20.82 // this should get increased by 15%
            }, 
            {
                "size" : "Small",
                "Price" : 42.29
            }
        ]
    },
{
        "_id" : 24,
        "name" : "Polo Shirt 2",
        "Variants" : [ 
            {
                "size" : "Large",
                "Price" : 182.42
            }, 
            {
                "size" : "Medium",
                "Price" : 120.82  // this should get increased by 15%
            }, 
            {
                "size" : "Small",
                "Price" : 142.29
            }
        ]
    }

I started something like this. Not sure if this is the right start

db.products.find().forEach(function(product){
    var myArr = product.Variants;
    print(myArr.min());
});

Solution

  • There is a problem here in that you cannot in a single update statement identify the "minimum" value in an array to use with a positional update, so you are right in a way with your current approach.

    It is arguable that a better approach would be to pre-determine which element is the minimal element and this pass that to the update. You can do this using .aggregate():

    var result = db.products.aggregate([
        { "$unwind": "$Variants" },
        { "$sort": { "_id": 1, "Variants.price" } }
        { "$group": {
            "_id": "$_id",
            "size":  { "$first": "$Variants.size" },
            "price": { "$first": "$Variants.price" }
        }},
        { "$project": {
            "size": 1,
            "price": 1,
            "adjusted": { "$multiply": [ "$price", 1.15 ] }
        }}
    ])
    

    So of course that is yet only a result with simply the lowest Variant item details for each product but then you could use the results like this:

    result.result.forEach(function(doc) {
        db.products.update(
            { 
                "_id": doc._id, 
                "Variants": { "$elemMatch": {
                    "size": doc.size,
                    "price": doc.price
                }}
            },
            {
                "$set": { 
                    "Variants.$.price": doc.adjusted
                 }
            } 
        }
    })
    

    That is not the best form but it does at least remove some of the overhead with iterating an array and allows a way to do the calculations on the server hardware, which is possibly of a higher spec from the client.

    It still doesn't really look like too much though until you take in some features available for MongoDB 2.6 and upwards. Notably that aggregate gets a cursor for a response and that you can now also do "bulk updates". So the form can be changed like so:

    var cursor = db.products.aggregate([
        { "$unwind": "$Variants" },
        { "$sort": { "_id": 1, "Variants.price" } }
        { "$group": {
            "_id": "$_id",
            "size":  { "$first": "$Variants.size" },
            "price": { "$first": "$Variants.price" }
        }},
        { "$project": {
            "size": 1,
            "price": 1,
            "adjusted": { "$multiply": [ "$price", 1.15 ] }
        }}
    ]);
    
    var batch = [];
    
    while ( var doc = cursor.next() ) {
    
        batch.push({
            "q": {
                "_id": doc._id, 
                "Variants": { "$elemMatch": {
                    "size": doc.size,
                    "price": doc.price
                }}
            },
            "u": {
                "$set": { 
                    "Variants.$.price": doc.adjusted
                 }
            }
        });
    
        if ( batch.length % 500 == 0 ) {
            db.command({ "update": "products", "updates": batch });
        }
    }
    
    db.command({ "update": "products", "updates": batch });
    

    So that is really nice in that while you are still iterating over a list the traffic and waiting for responses over the wire has really been minimized. The best part is the batch updates which are occurring ( by the math usage ) only once per 500 items. The maximum size of the batch items is actually the BSON limit of 16MB so you can tune that as appropriate.

    That gives a few good reasons if you are currently developing a product to move to the 2.6 version.

    The only final footnote I would add considering you are dealing with "prices" is try not to use floating point math for this and look for a form using whole integers instead as it avoids a lot of problems.