Search code examples
pythonredditpraw

Getting all submissions for the past two months from a particular subreddit (using PRAW)?


I was trying to get all the /r/politics posts for the last two months with all the comments and the user details. How do I do this using PRAW?

Should I go through the posts in get_hot()? Any ideas on how to go about this? Are there any time-stamp methods that I can take advantage of?


Solution

  • You can use the Cloudsearch syntax to search between 2 timestamps. The syntax is:

    timestamp:START_UNIX_TIMESTAMP..END_UNIX_TIMESTAMP
    

    If you are expecting less than 1000 entries to be returned, it should be relatively straightforward just to set up a search to do this. Search queries are limited to 1000 requests, though, so some special logic is required if more posts than this are expected.

    To perform a search for any post in the last 2 months, try:

    import time
    current_timestamp = time.time()
    # 60 seconds * 60 minutes * 24 hours * 60 days = 2 months
    two_months_timestamp = current_timestamp - (60 * 60 * 24 * 60)
    query = 'timestamp:{}..{}'.format(current_timestamp, two_months_timestamp)
    results = reddit.subreddit('test').search(query, sort='new')
    

    If you need to get more than 1000, I would suggest getting the timestamp of the last item in the search results, then storing the timestamp of that and searching for timestamp:<current_timestamp>..<last_item_timestamp> and repeat until there are no more results.