Search code examples
mongodbpymongomongodb-querymongodb-update

MongoDB find and modify request update


I have the following structure in my MongoDB database for a product :

product_1 = {
    'name':'...',
    'logo':'...',
    'nutrition':'...',
    'brand':{
        'name':'...',
        'logo':'...'
        },
    'stores':{[
            'store_id':...,
            'url':'...',
            'prices':{[
                        'date':..., 
                        'price':...
                       ]}
             ]}
    })

My pymongo script goes from store to store and I try to do the following thing :

  • if the product is not in the database : add all the informations about the product with the current price and date for the current store_id.
  • if the product is in the database but I don't have any entries for the current stroe : add an entry in stores with the current price, date and store_id.
  • if the product is in the database and I have a price entry for the current shop but the current price is not the same : add a new entry with the new date and price for the current store_id.

Is it possible to do all in one request ? For now I have been trying the following, without really knowing how to handles the stores and prices case.
Maybe it is not the best way to contruct my database, I am open to suggestions.

db.find_and_modify(
                query={'$and':[
                        {'name':product['name']},
                        {'stores':{'$in':product['store_id']}}
                        ]},
                update={
                    '$setOnInsert':{
                                    'name':product['product_name'],
                                    'logo':product['product_logo'],
                                    'brand':product['brand'],
                                    [something for stores and prices ?]
                                    },
                    },
                upsert=True
                )

Solution

  • There isn't presently (MongoDB 2.6) a way to do all of those things in one query. You can retrieve the document and do the updates, then save the updated version (or insert the new document):

    oldDoc = collection.find_one({ "name" : product["name"] })
    if oldDoc:
        # examine stores + prices to create updated doc called newDoc
    else:
        # make a new doc, newDoc, for the product
    collection.save(newDoc) # you can use save if you put the same _id on newDoc as oldDoc
    

    Alternatively, I think your nested array schema is a cause of this headache and may cause more headaches down the line (e.g. update the price for a particular product for a particular date and store - cannot do this with a single database call). I would make each document represent the lowest level of one of your nested arrays - a particular product for sale for a particular price at a particular store on a particular date:

    {
        "product_name" : "bacon mayonnaise",
        "store_id" : "123ZZZ8",
        "price" : 99,
        "date" : ISODate("2014-12-31T17:18:53.944Z")
        // other stuff
    }
    

    You can duplicate a lot of the generic product information in each document or store it in a separate document - whether it's worth duplicating some information versus making another trip to the database to recall product details depends on how your going to use the documents. Notice that the problem you're trying to solve just disappears with this structure - you just do an upsert for a given product, store, price, and date.