F.W. This isn't just a PRAW question; it leans more toward Python than PRAW. Python people are welcome to contribute, and please note this is not my native language xD!
Essentially, I'm writing a Reddit bot using PRAW that does the following. For a thread like either of these:
- Post by @dudeOne
- Comment by @dudeTwo
- Comment with "!completed" by @dudeOne
or
- Post by @dudeOne
- Comment by @dudeTwo
- Comment with "!completed" by @moderatorOne
it should print("Hey"), and for a thread like this:
- Post by @dudeOne
- Comment by @dudeOne
- Comment with "!completed" by @dudeOne
it should do nothing, or maybe even remove it and message @dudeOne.
Here's my messy code (xD):
import praw
import os
import re
sub = "RedditsQuests"
client_id = os.environ.get('client_id')
client_secret = os.environ.get('client_secret')
password = os.environ.get('pass')
reddit = praw.Reddit(client_id=client_id,
                     client_secret=client_secret,
                     password=password,
                     user_agent='r/RedditsQuests bot',
                     username='TheQuestMaster')
for submission in reddit.subreddit(sub).new(limit=None):
    submission.comments.replace_more(limit=None)
    if submission.saved is False:
        for comment in submission.comments.list():
            if ((("!completed" in comment.body)) and ((comment.is_submitter) or ('RedditsQuests' in comment.author.moderated())) and (comment.parent().author.name is not submission.author.name)):
                print("etc...")
There's a decently sized stack trace, so I've put it in this bin for your reference. To me it looks like PRAW is timing out because the if-in-for loop is taking too long. I could be wrong, though!
The issue (as you've said) is somewhat sporadic but I've narrowed it down. As it turns out, trying to fetch the subreddits moderated by /u/AutoModerator will sometimes time out (presumably because the list is long).
Here's how I found the issue. Skip this section if you're only interested in the solution.
First, I modified your script to use try and except to catch the exception when it happened. Your traceback told me that it was happening on the line that starts with if ((("!completed" in comment.body)), specifically when fetching the subreddits that a user moderates. Here was my modified script:
for submission in reddit.subreddit(sub).new(limit=None):
    submission.comments.replace_more(limit=None)
    if submission.saved is False:
        for comment in submission.comments.list():
            try:
                if (
                    (("!completed" in comment.body))
                    and (
                        (comment.is_submitter)
                        or ("RedditsQuests" in comment.author.moderated())
                    )
                    and (comment.parent().author.name is not submission.author.name)
                ):
                    print("etc...")
            except Exception:
                print(f'Author: {comment.author} ({type(comment.author)})')
And the output:
etc...
etc...
Author: AutoModerator (<class 'praw.models.reddit.redditor.Redditor'>)
etc...
etc...
etc...
Author: AutoModerator (<class 'praw.models.reddit.redditor.Redditor'>)
etc...
etc...
etc...
etc...
etc...
etc...
etc...
Author: AutoModerator (<class 'praw.models.reddit.redditor.Redditor'>)
etc...
Author: AutoModerator (<class 'praw.models.reddit.redditor.Redditor'>)
etc...
etc...
With this in mind I wrote a very simple 3-line script to reproduce the issue:
import praw
reddit = praw.Reddit(...)
print(reddit.redditor("AutoModerator").moderated())
Sometimes this script would succeed but sometimes it would fail with the same socket read timeout. Presumably the timeout happens because AutoModerator moderates so many subreddits (at least 10,000), and the Reddit API takes too long to process the request.
Your script tries to determine whether the redditor in question is a moderator of the subreddit. You're doing this by checking if the subreddit is in the list of the user's moderated subreddits, but you can switch this to checking if the user is in the list of the subreddit's moderators. Not only should this not time out, but you'll be saving a lot of network requests because you can just fetch the list of moderators once.
The PRAW documentation of Subreddit shows how we can get a list of moderators of a subreddit. In your case, we can do
moderators = list(reddit.subreddit(sub).moderator())
Then, instead of checking "RedditsQuests" in comment.author.moderated(), we check
comment.author in moderators
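If you'd rather compare plain names than Redditor objects, here's a rough sketch of the same check. It isn't part of PRAW: the helper names moderator_names and is_moderator are made up for illustration, and SimpleNamespace objects stand in for Redditor objects, assuming only that each one has a .name attribute. It also treats a missing author (PRAW gives None for deleted accounts) as a non-moderator.

```python
from types import SimpleNamespace


def moderator_names(moderators):
    """Collect lowercased names from objects that expose a .name attribute."""
    return {mod.name.lower() for mod in moderators}


def is_moderator(author, mod_names):
    """Return True if the author's name is in mod_names; None never matches."""
    if author is None:
        return False
    return author.name.lower() in mod_names


# Stand-ins for the objects returned by reddit.subreddit(sub).moderator()
mods = [SimpleNamespace(name="moderatorOne"), SimpleNamespace(name="TheQuestMaster")]
names = moderator_names(mods)

print(is_moderator(SimpleNamespace(name="ModeratorOne"), names))  # True
print(is_moderator(SimpleNamespace(name="dudeTwo"), names))       # False
print(is_moderator(None, names))                                  # False
```

The script that follows keeps the simpler comment.author in moderators check.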
Your code then becomes
import praw
import os
import re
sub = "RedditsQuests"
client_id = os.environ.get("client_id")
client_secret = os.environ.get("client_secret")
password = os.environ.get("pass")
reddit = praw.Reddit(
    client_id=client_id,
    client_secret=client_secret,
    password=password,
    user_agent="r/RedditsQuests bot",
    username="TheQuestMaster",
)
# Fetch the subreddit's moderators once, up front
moderators = list(reddit.subreddit(sub).moderator())
for submission in reddit.subreddit(sub).new(limit=None):
    submission.comments.replace_more(limit=None)
    if submission.saved is False:
        for comment in submission.comments.list():
            if (
                (("!completed" in comment.body))
                and ((comment.is_submitter) or (comment.author in moderators))
                # != compares the names by value; "is not" compares object identity
                and (comment.parent().author.name != submission.author.name)
            ):
                print("etc...")
In my brief testing, this script runs many times faster, since we only get the list of moderators once, rather than fetching all subreddits moderated by all users who commented.
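If the long condition gets hard to read, one option is to pull it into a small function. This is only a sketch under some assumptions: is_valid_completion and completed are hypothetical names, and SimpleNamespace objects stand in for PRAW's comment, submission, and redditor objects so the three threads from the question can be checked without touching Reddit. It also guards against a deleted author, which PRAW reports as None.

```python
from types import SimpleNamespace


def is_valid_completion(comment, submission, moderators):
    """Return True when a "!completed" comment should trigger the bot."""
    if "!completed" not in comment.body:
        return False
    # Only the original poster or a subreddit moderator may complete a quest.
    if not (comment.is_submitter or comment.author in moderators):
        return False
    parent_author = comment.parent().author
    # A deleted author shows up as None, so guard before reading .name.
    if parent_author is None or submission.author is None:
        return False
    # != compares the names by value, not by object identity.
    return parent_author.name != submission.author.name


# Stand-ins for the users and the post from the question.
dude_one = SimpleNamespace(name="dudeOne")
dude_two = SimpleNamespace(name="dudeTwo")
mod_one = SimpleNamespace(name="moderatorOne")
post = SimpleNamespace(author=dude_one)
moderators = [mod_one]


def completed(author, parent):
    """Build a fake "!completed" comment replying to the given parent."""
    return SimpleNamespace(
        body="!completed",
        author=author,
        is_submitter=(author == post.author),
        parent=lambda: parent,
    )


reply_by_two = SimpleNamespace(author=dude_two)
reply_by_one = SimpleNamespace(author=dude_one)

print(is_valid_completion(completed(dude_one, reply_by_two), post, moderators))  # True
print(is_valid_completion(completed(mod_one, reply_by_two), post, moderators))   # True
print(is_valid_completion(completed(dude_one, reply_by_one), post, moderators))  # False
```

In the real loop you would pass the PRAW comment and submission objects straight in, along with the moderators list fetched once before the loop.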
As an unrelated style note, instead of if submission.saved is False you should do if not submission.saved, which is the conventional way to check if a condition is false.
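To show how the two spellings differ, here's a tiny sketch (describe is just a hypothetical helper). As far as I know submission.saved is a plain bool, so both forms should behave the same in your script; the difference only shows up with other falsy values such as None or 0.

```python
def describe(saved):
    """Return (result of `saved is False`, result of `not saved`)."""
    return (saved is False, not saved)


for value in (False, None, 0, True):
    print(repr(value), describe(value))
```

Only the literal False passes the is False test, while not treats False, None, and 0 alike.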