
How can I make a list of the top comments in a subreddit with PRAW?


I need to grab the top comments in a subreddit, from all time.

I have tried grabbing all the submissions and iterating through them, but unfortunately the number of posts you can get is limited to 1000.

I have tried using Subreddit.get_comments, but it returns only 25 comments.

So I am looking for a way around that.

Can you help me out?


Solution

  • You can call get_comments with limit set to None to get all available comments. By default it uses the amount configured for the account, which is usually 25. (get_comments accepts the same parameters as get_content, including limit.)

    However, this probably won't do what you want: get_comments (or more specifically /r/subreddit/comments) only offers a list of new comments or new gilded comments, not top comments. And since get_comments is also capped at 1000 comments, you'll have trouble building a full list of top comments.

    So what you really want is your original approach: getting the list of top submissions and then the top comments of those. It's not a perfect system (a low-scoring post might actually have a highly voted comment), but it's the best available.

    Here's some code:

    import praw
    
    r = praw.Reddit(user_agent='top_comment_test')
    subreddit = r.get_subreddit('opensource')
    # For a potentially more accurate set of top comments, increase the limit (but it'll take longer)
    top = subreddit.get_top(params={'t': 'all'}, limit=25)
    all_comments = []
    for submission in top: 
        submission_comments = praw.helpers.flatten_tree(submission.comments)
        # Don't include non-comment objects such as "morecomments"
        real_comments = [comment for comment in submission_comments if isinstance(comment, praw.objects.Comment)]
        all_comments += real_comments
    
    all_comments.sort(key=lambda comment: comment.score, reverse=True)
    
    top_comments = all_comments[:25]  # top 25 comments
    
    # Parenthesized form works under both Python 2 and Python 3
    print(top_comments)
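
The get_* methods used above belong to older PRAW releases and were removed in PRAW 4. As a rough sketch of the same approach against the newer naming, the helper below assumes it is handed a PRAW 4+ style subreddit object (one exposing top(time_filter=..., limit=...), with submissions whose comments offer replace_more() and list()). The function name top_comments and its parameters post_limit and n are made up for this illustration and are not part of PRAW.

```python
import heapq


def top_comments(subreddit, post_limit=25, n=25):
    """Return the n highest-scoring comments found in the subreddit's top posts.

    `subreddit` is assumed to behave like a PRAW 4+ Subreddit object,
    e.g. the result of praw.Reddit(...).subreddit("opensource").
    """
    collected = []
    for submission in subreddit.top(time_filter="all", limit=post_limit):
        # Drop the "load more comments" placeholders rather than fetching them;
        # with limit=0 every MoreComments object is removed from the tree.
        submission.comments.replace_more(limit=0)
        # list() flattens the comment tree into a single list of comments.
        collected.extend(submission.comments.list())
    # nlargest avoids sorting the whole list when only n items are needed.
    return heapq.nlargest(n, collected, key=lambda comment: comment.score)
```

With an authenticated praw.Reddit instance named reddit, a call would look like top_comments(reddit.subreddit('opensource')). The same caveat applies as before: only comments under the sampled top posts are considered, so raising post_limit trades run time for coverage.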