python mongodb mongodb-query pymongo

Update document only if there is no matching value


In MongoDB, how do you skip an update if a value for one field of the document already exists?

To give an example, I have the following document structure, and I'd like to update it only if no entry in the good array has a matching link value.

{
    "_id": {
        "$oid": "56e9978732beb44a2f2ac6ae"
    },
    "domain": "example.co.uk",
    "good": [
        {
            "crawled": true,
            "added": {
                "$date": "2016-03-16T17:27:17.461Z"
            },
            "link": "/url-1"
        },
        {
            "crawled": false,
            "added": {
                "$date": "2016-03-16T17:27:17.461Z"
            },
            "link": "url-2"
        }

    ]
}

My update query is:

links.update(
    {"domain": "example.co.uk"},
    {"$addToSet": {"good": {"crawled": False, "link": "/url-1"}}},
    True,  # upsert
)

Part of the problem is that the crawled field could be set to True or False, and the date will always be different. I don't want to add to the array if the URL already exists, regardless of the crawled status.
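To illustrate why the query above keeps adding entries, here is a rough pure-Python stand-in for how $addToSet treats subdocuments: an element counts as a duplicate only if the whole subdocument matches, not just one field. The add_to_set helper is hypothetical and only mimics that comparison (it ignores field order, which MongoDB also takes into account); it is not part of PyMongo.

```python
from datetime import datetime


def add_to_set(array, element):
    """Hypothetical stand-in for $addToSet: append unless an identical element exists."""
    if element not in array:
        array.append(element)
    return array


# Existing array, mirroring the first entry of the example document.
good = [
    {"crawled": True, "added": datetime(2016, 3, 16, 17, 27, 17), "link": "/url-1"},
]

# Same link, but "crawled" differs and "added" is missing, so the
# subdocuments are not identical and the entry is appended anyway.
add_to_set(good, {"crawled": False, "link": "/url-1"})
print(len(good))  # 2
```

Because the comparison covers every field, a differing crawled flag or timestamp is enough for the same link to be stored twice.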

Update: Just for clarity, if the URL is not within the document, I want it to be added to the existing array, for example, if /url-3 was introduced, the document would look like this:

{
    "_id": {
        "$oid": "56e9978732beb44a2f2ac6ae"
    },
    "domain": "example.co.uk",
    "good": [
        {
            "crawled": true,
            "added": {
                "$date": "2016-03-16T17:27:17.461Z"
            },
            "link": "/url-1"
        },
        {
            "crawled": false,
            "added": {
                "$date": "2016-03-16T17:27:17.461Z"
            },
            "link": "url-2"
        },
        {
            "crawled": false,
            "added": {
                "$date": "2016-04-16T17:27:17.461Z"
            },
            "link": "url-3"
        }

    ]
}

The domain will be unique and specific to the link. I want to insert the link into the good array if it doesn't exist, and do nothing if it does.


Solution

  • One way to do this is to first check, using the find_one method, whether any document in the collection already matches your criteria, including the "good.link" field in the filter. If no document matches, run your update with the update_one method, this time leaving "good.link" out of the filter. You also don't need the $addToSet operator: it compares whole subdocuments, so with differing crawled and added values it never detects the duplicate link. Simply use the $push update operator, which makes your intention clear. You don't need the "upsert" option here either.

    # Push the link only if no entry in "good" already has it.
    if not links.find_one({"domain": "example.co.uk", "good.link": "/url-1"}):
        links.update_one({"domain": "example.co.uk"},
                         {"$push": {"good": {"crawled": False, "link": "/url-1"}}})
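As a possible variation on the two-step check above, the existence test can be folded into the update filter with $ne, so that a single update_one call does both. This is only a sketch: build_push_if_absent and push_link_if_absent are hypothetical helper names, and links is assumed to be a PyMongo collection like the one in the question.

```python
def build_push_if_absent(domain, link):
    """Return (filter, update) documents for a conditional $push.

    The filter matches the domain document only when no element of
    "good" has this link, so the $push is skipped if the link exists.
    """
    query = {"domain": domain, "good.link": {"$ne": link}}
    update = {"$push": {"good": {"crawled": False, "link": link}}}
    return query, update


def push_link_if_absent(links, domain, link):
    """Run the conditional push; return True if a document was modified.

    `links` is assumed to be a pymongo Collection (or any object with a
    compatible update_one method).
    """
    query, update = build_push_if_absent(domain, link)
    result = links.update_one(query, update)
    return result.modified_count == 1
```

Calling push_link_if_absent(links, "example.co.uk", "/url-3") would add the entry once and leave the document unchanged on later calls. Since the check and the push happen in one operation on a single document, there is no window between them for another writer to add the same link.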