Tags: aggregate, pymongo, date-conversion

How to convert a Twitter date in an aggregation pipeline with pymongo?


I have a MongoDB collection in which I store tweets. For each user, I want to aggregate the number of tweets and the earliest date on which that user posted a tweet.

So far, I used this query:

pipe = [
    {"$group": {"_id": "$user.screen_name",
                "count": {"$sum": 1},
                "minDate": {"$min": "$created_at"}}},
]

list(collection_full.aggregate(pipeline=pipe))

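However, "$created_at" is stored as a string, so the minimum is computed by string comparison. Because each value starts with the day name, I always get a Friday: "F" sorts before the first letter of every other day name. I want to convert "$created_at" to a datetime before taking the minimum.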
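To illustrate the problem outside MongoDB, here is a minimal Python sketch. The sample strings are hypothetical values in the classic Twitter "created_at" layout, and the strptime format string is my assumption about that layout (it also assumes an English locale for day and month names):

```python
from datetime import datetime

# Hypothetical sample values, assumed to follow the Twitter "created_at" layout.
samples = [
    "Wed Oct 10 20:19:24 +0000 2018",
    "Fri Oct 12 08:00:00 +0000 2018",
    "Mon Oct 08 09:30:00 +0000 2018",
]

# Assumed format: day name, month name, day, time, UTC offset, year.
TWITTER_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_created_at(value):
    """Parse one created_at string into a timezone-aware datetime."""
    return datetime.strptime(value, TWITTER_FORMAT)


# String comparison: the value starting with "Fri" wins on its first letter.
earliest_as_string = min(samples)

# Date comparison: the chronologically earliest tweet wins.
earliest_as_date = min(samples, key=parse_created_at)

print(earliest_as_string)
print(earliest_as_date)
```

The first result is the Friday tweet even though the Monday tweet is older, which is the same behaviour I see from "$min" on the string field.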

Thanks.


Solution

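  • I found the answer after going through the documentation. I had to add a new field in the pipeline that converts the string to a date with $dateFromString (available in MongoDB 3.6 and later), and then take the minimum of that field instead.
    This pipeline worked for me: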

    pipe = [
        # Parse "created_at" into a date first, so $min compares dates, not strings.
        {"$addFields": {"create_date": {"$dateFromString": {"dateString": "$created_at"}}}},
        {"$group": {"_id": "$user.screen_name", "count": {"$sum": 1}, "minDate": {"$min": "$create_date"}}},
    ]
    list(collection_full.aggregate(pipeline=pipe))
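If some documents might have a missing or malformed "created_at", the same pipeline can be made more tolerant. The sketch below is one possible variant, not part of what I originally ran: build_min_date_pipeline is a hypothetical helper name, and it assumes MongoDB 4.0 or later, where $dateFromString accepts the onError and onNull options. Setting both to None (null) means bad values become null, and $min in a $group stage only considers non-null values when at least some are present.

```python
def build_min_date_pipeline(date_field="created_at",
                            group_field="user.screen_name"):
    """Hypothetical helper: per-user tweet count and earliest parsed date."""
    return [
        {"$addFields": {"create_date": {"$dateFromString": {
            "dateString": f"${date_field}",
            # Assumes MongoDB 4.0+: return null instead of raising an error.
            "onError": None,
            "onNull": None,
        }}}},
        {"$group": {
            "_id": f"${group_field}",
            "count": {"$sum": 1},
            "minDate": {"$min": "$create_date"},
        }},
    ]


pipeline = build_min_date_pipeline()

# With a live collection (not run here):
# results = list(collection_full.aggregate(pipeline=pipeline))
```

Note that "count" still includes tweets whose date could not be parsed; only "minDate" skips them.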