Search code examples
pythonmongodbpymongo

Filter pymongo query based on timedelta for two fields for each document


I am using MongoDB and Python to work with a collection where each document looks like this:

{_id: 'e82d0b5c-5e38-4442-a678-f29245681238',
start_time: '2020-02-25T20:45:10.623Z',
stop_time: '2020-02-26T21:45:10.623Z',
data: {cost: 3, buyer: 'John'}}

I would like to find all documents in my collection for which the difference between start_time and stop_time is superior to 24 hours. One potential issue I considered is that the format for start_time and stop_time is String, and not Date, but I am able to generate a cursor using:

db['collection'].find({'stop_time':{'$gte': 'start_time'}})

But I cannot generate a cursor using any of the following:

db['collection'].find({'stop_time':{'$gte': 'start_time' + datetime.timedelta(hours=24)}})
TypeError: can only concatenate str (not "datetime.timedelta") to str
db['collection'].find({'stop_time':{'$gte': {'$toDate': 'start_time'} + datetime.timedelta(hours=24)}})
TypeError: unsupported operand type(s) for +: 'dict' and 'datetime.timedelta'

I apologize if this is found to be a duplicate, but I haven't managed to find a post solving my problem. Thank you very much for your time!


Solution

  • Here is a solution with an aggregate

    [
      {
        "$addFields": {
          "date_diff": {
            "$subtract": [
              {
                "$dateFromString": {
                  "dateString": "$stop_time"
                }
              },
              {
                "$dateFromString": {
                  "dateString": "$start_time"
                }
              }
            ]
          }
        }
      },
      {
        "$match": {
          "date_diff": {
            "$gte": 86400000 //24*3600*1000
          }
        }
      }
    ]
    

    I first use $addFields, to create a new field that will contain the difference between the two dates. To calculate that difference I use $subtract, but before I convert the string to a date format with $dateFromString

    Then we can filter only the documents that have a difference greater or equal to 24 hours (or 86400000 milliseconds) with a $match.

    try it here