I accidentally disconnected my internet connection and received the error below. But why did this line trigger the error?
self.content += tuple(subreddit_posts)
Or perhaps I should ask: why did the following block not lead to a sys.exit()? It seems the bare except should catch all errors:
try:
    subreddit_posts = self.r.get_content(url, limit=10)
except:
    print '*** Could not connect to Reddit.'
    sys.exit()
Does this mean I am inadvertently hitting reddit's network twice?
FYI, praw is a Reddit API client, and get_content() fetches a subreddit's posts/submissions as a generator object.
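A generator's body does not run until the generator is iterated, which can be demonstrated without praw. A minimal sketch (`fake_get_content` is a made-up stand-in, not a praw function):

```python
def fake_get_content():
    # Stand-in for a lazy fetch: nothing in this body runs at call time.
    raise IOError('503 Server Error')
    yield  # the yield makes this function a generator

gen = fake_get_content()  # no exception raised here
escaped = False
try:
    tuple(gen)  # iteration runs the body, which raises now
except IOError:
    escaped = True
print(escaped)  # True
```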
The error message:
Traceback (most recent call last):
  File "beam.py", line 49, in <module>
    main()
  File "beam.py", line 44, in main
    scan.scanNSFW()
  File "beam.py", line 37, in scanNSFW
    map(self.getSub, self.nsfw)
  File "beam.py", line 26, in getSub
    self.content += tuple(subreddit_posts)
  File "/Library/Python/2.7/site-packages/praw/__init__.py", line 504, in get_co
    page_data = self.request_json(url, params=params)
  File "/Library/Python/2.7/site-packages/praw/decorators.py", line 163, in wrap
    return_value = function(reddit_session, *args, **kwargs)
  File "/Library/Python/2.7/site-packages/praw/__init__.py", line 557, in reques
    retry_on_error=retry_on_error)
  File "/Library/Python/2.7/site-packages/praw/__init__.py", line 399, in _reque
    _raise_response_exceptions(response)
  File "/Library/Python/2.7/site-packages/praw/internal.py", line 178, in _raise
    response.raise_for_status()
  File "/Library/Python/2.7/site-packages/requests/models.py", line 831, in rais
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 503 Server Error: Service Unavailable
The script (it's short):
import sys, os, pprint, praw

class Scanner(object):
    ''' A scanner object. '''
    def __init__(self):
        self.user_agent = 'debian.22990.myapp'
        self.r = praw.Reddit(user_agent=self.user_agent)
        self.nsfw = ('funny', 'nsfw')
        self.nsfw_posters = set()
        self.content = ()

    def getSub(self, subreddit):
        ''' Accepts a subreddit. Connects to subreddit and retrieves content.
        Unpacks generator object containing content into tuple. '''
        url = 'http://www.reddit.com/r/{sub}/'.format(sub=subreddit)
        print 'Scanning:', subreddit
        try:
            subreddit_posts = self.r.get_content(url, limit=10)
        except:
            print '*** Could not connect to Reddit.'
            sys.exit()
        print 'Constructing list.',
        self.content += tuple(subreddit_posts)
        print 'Done.'

    def addNSFWPoster(self, post):
        print 'Parsing author and adding to posters.'
        self.nsfw_posters.add(str(post.author))

    def scanNSFW(self):
        ''' Scans all NSFW subreddits. Makes list of posters. '''
        # Get content from all nsfw subreddits
        print 'Executing map function.'
        map(self.getSub, self.nsfw)
        # Scan content and get authors
        print 'Executing list comprehension.'
        [self.addNSFWPoster(post) for post in self.content]

def main():
    scan = Scanner()
    scan.scanNSFW()
    for i in scan.nsfw_posters:
        print i
    print len(scan.content)

main()
It looks like praw fetches objects lazily, so the HTTP request is only made when you actually consume subreddit_posts. That explains why it blows up on the tuple(subreddit_posts) line rather than inside your try block. (It also means you are not hitting Reddit's network twice; the single request just happens later than you expect.)
See: https://praw.readthedocs.org/en/v2.1.20/pages/lazy-loading.html
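Given that, one fix is to consume the generator inside the try and catch a specific exception instead of using a bare except:. A sketch with a hypothetical stand-in generator; in the real script you would wrap tuple(self.r.get_content(url, limit=10)) and catch requests.exceptions.HTTPError:

```python
import sys

def get_content_stub(limit=10):
    # Hypothetical stand-in for praw's get_content(); yields fake posts.
    for i in range(limit):
        yield 'post-%d' % i

content = ()
try:
    # tuple() forces the lazy fetch here, so any error raised during
    # iteration is caught by this except clause.
    subreddit_posts = tuple(get_content_stub(limit=10))
except IOError:  # in the real script: requests.exceptions.HTTPError
    sys.exit('*** Could not connect to Reddit.')
content += subreddit_posts
print(len(content))  # 10
```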