python-3.x pymongo tweets

How to utilize OR statements and variable assignment in PyMongo?


I am working with tweets and would like to assign a brand title and an ad title to each one. I am working in Python rather than directly in MongoDB because I want to automate this process for an upcoming event. Overall, I am processing over 700k tweets, so I would like to push as much of the work as possible into PyMongo queries to keep processing times short.

The following code is part of a much broader script that gathers and aggregates the tweets before this point. This part only deals with assigning an ad based on regular expression matches. My issue is that every tweet in my test db is being updated to TRUE, even when its text does not match the given regular expression.

col.update_many({},
                {'$set': {"AdName": 'x'} }
                )

col.update_many({"AdName": {"$exists": True}},
                [{'$set':
                    {'AdName':
                        {"$or":
                        [{'$eq':[{'text':re.compile('BudLight')},'TestAdName']}]
                        }
                    }
                  }
                ]
                )

What I'd like to see happening

I am attempting to update these tweets to hold a specific ad name based on the regular expressions in the $or statement. Afterwards, I will assign a brand to each tweet based on the ad titles that have been assigned through this process.

I assume my update syntax may be incorrect, but the PyMongo documentation isn't all that helpful in relation to what I am attempting to do.

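ALSO: the $regex query operator cannot be used inside aggregation-style conditional expressions in the update. As far as I can tell, that is a MongoDB restriction rather than a PyMongo one ($regexMatch is the expression-level counterpart in MongoDB 4.2+).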


Solution

  • I was able to better understand how to implement this idea.

    My mistake was placing the conditional in the update argument. The $or operator belongs in the filter argument of update_many: the filter selects the matching tweets, and the update then sets AdTitle on only those documents.

    # Avengers Endgame
    col.update_many(
        # Filter: tweets whose text mentions either keyword.
        # '$options': 'i' makes each match case-insensitive, so separate
        # 'Avengers'/'avengers' entries are not needed.
        {'$or': [
            {'text': {'$regex': 'avengers', '$options': 'i'}},
            {'text': {'$regex': 'endgame', '$options': 'i'}},
        ]},
        # Update: applied only to the tweets matched by the filter.
        # A plain update document is enough for a constant value.
        {'$set': {'AdTitle': 'Avengers Endgame'}},
    )
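Since the goal is to repeat this for many ads and then attach a brand to each tweet, the same filter-then-$set pattern can probably be driven from a lookup table. The following is only a minimal sketch under some assumptions: the `AD_RULES` table, the brand labels, the `Brand` field name, and the helper names `keyword_filter` and `tag_tweets` are all hypothetical and not part of the original script. `col` is assumed to be a PyMongo collection, which is why the block itself does not import pymongo.

```python
import re

# Hypothetical lookup table: ad title -> (brand label, keywords identifying the ad).
# The brand labels and keyword lists are placeholders for illustration.
AD_RULES = {
    "Avengers Endgame": ("Marvel Studios", ["avengers", "endgame"]),
    "TestAdName": ("Bud Light", ["BudLight"]),
}


def keyword_filter(keywords, field="text"):
    """Build a case-insensitive filter that matches any keyword in `field`."""
    # re.escape keeps characters such as "." or "+" literal in the pattern.
    pattern = "|".join(re.escape(k) for k in keywords)
    return {field: {"$regex": pattern, "$options": "i"}}


def tag_tweets(col, rules=AD_RULES):
    """Run one update_many per ad and return {ad_title: modified_count}."""
    counts = {}
    for ad_title, (brand, keywords) in rules.items():
        result = col.update_many(
            keyword_filter(keywords),                         # which tweets
            {"$set": {"AdTitle": ad_title, "Brand": brand}},  # what to write
        )
        counts[ad_title] = result.modified_count
    return counts
```

Calling `tag_tweets(col)` issues one server-side update per ad, so no tweet text has to be pulled into Python. One design caveat: a tweet that matches the keywords of several ads ends up with the title of whichever rule ran last, so the order of the table matters.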