
Python reddit API: efficiently parse all comments in a subreddit


I am trying to write a chatbot that scans every comment posted to a subreddit.

Currently I do this by fetching the last Y comments every X seconds:

import time

import praw

# config, subreddit, X and Y are defined elsewhere in the bot
handle = praw.Reddit(username=config.username,
                    password=config.password,
                    client_id=config.client_id,
                    client_secret=config.client_secret,
                    user_agent="cristiano corrector v0.1a")
while True:
    # Fetch the Y most recent comments, then wait X seconds
    last_comments = handle.subreddit(subreddit).comments(limit=Y)
    for comment in last_comments:
        pass  # process comments
    time.sleep(X)

I am not satisfied with this approach. Consecutive scans can overlap heavily, so some comments are processed twice (which can be solved by tracking comment IDs), while others are missed entirely when more than Y comments arrive within X seconds. Is there a better way to do this with the API?
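
The ID tracking mentioned above could look something like the following minimal sketch. `SeenIds` and `only_new` are hypothetical helper names, not part of PRAW; the only assumption about a comment is that it has an `id` attribute, as PRAW comment objects do. The cache is bounded so that a long-running bot does not grow without limit.

```python
from collections import OrderedDict
from types import SimpleNamespace


class SeenIds:
    """Remember the most recent comment IDs, evicting the oldest first."""

    def __init__(self, max_size=1000):
        self.max_size = max_size
        self._ids = OrderedDict()

    def add(self, comment_id):
        """Return True if the ID is new, False if it was already seen."""
        if comment_id in self._ids:
            return False
        self._ids[comment_id] = None
        if len(self._ids) > self.max_size:
            self._ids.popitem(last=False)  # drop the oldest entry
        return True


def only_new(comments, seen):
    """Yield only the comments whose ID has not been seen before."""
    for comment in comments:
        if seen.add(comment.id):
            yield comment


# Stand-in objects for illustration; real code would pass the listing
# returned by handle.subreddit(subreddit).comments(limit=Y).
first_poll = [SimpleNamespace(id="a1"), SimpleNamespace(id="a2")]
second_poll = [SimpleNamespace(id="a2"), SimpleNamespace(id="a3")]

seen = SeenIds(max_size=100)
fresh_first = [c.id for c in only_new(first_poll, seen)]
fresh_second = [c.id for c in only_new(second_poll, seen)]
```

This only removes the duplicates. It does not help with comments that are missed because more than Y of them arrived between two scans.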


Solution

  • I found a solution using the comment stream in PRAW. Details are in the reply bot tutorial: https://praw.readthedocs.io/en/latest/tutorials/reply_bot.html

    And in my code:

    import praw

    handle = praw.Reddit(username=config.username,
                        password=config.password,
                        client_id=config.client_id,
                        client_secret=config.client_secret,
                        user_agent="cristiano corrector v0.1a")
    
    # The stream yields comments one by one as they are posted
    for comment in handle.subreddit(subreddit).stream.comments():
        pass  # process comments
    

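    This removes the manual polling loop: PRAW handles the polling itself and skips comments it has already yielded, so there is no overlap to track and no sleep interval to tune. It should also save some CPU and network load.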
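
One possible way to structure the stream loop so the processing logic can be tested without a network connection is sketched below. `handle_stream` and `run_bot` are hypothetical names introduced for illustration. The sketch assumes the bot should ignore comments that existed before it started, which is what the `skip_existing=True` stream option is for; drop that argument if the backlog should be processed too.

```python
def handle_stream(comment_stream, process):
    """Call process() on each comment; return how many were handled."""
    handled = 0
    for comment in comment_stream:
        if comment is None:
            # If the stream was created with pause_after, PRAW yields None
            # between polls; there is nothing to process in that case.
            continue
        process(comment)
        handled += 1
    return handled


def run_bot(handle, subreddit_name, process):
    """Process only comments posted after the bot starts."""
    # handle is a praw.Reddit instance created as in the answer above.
    # skip_existing=True skips comments that already exist at startup.
    stream = handle.subreddit(subreddit_name).stream.comments(skip_existing=True)
    return handle_stream(stream, process)
```

With a real `praw.Reddit` handle the stream never ends, so `run_bot(handle, subreddit, my_callback)` would run until interrupted, with `my_callback` being whatever function processes a single comment.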