python numpy opencv praw

In python, how do I get urllib to recognize multiple lines in a string as separate URLs?


I'm very new to coding, so forgive any errors in my explanation! I'm trying to write Python code that uses PRAW to access the /r/pics subreddit, scrape the source URLs, and display the images with urllib, cv2, and numpy.

Currently my code looks like this:

import praw
import numpy as np
import urllib.request  # urlopen lives in the urllib.request submodule
import cv2

# urllib set-up
def reddit_scrape(url):
    resp = urllib.request.urlopen(url)
    image = np.asarray(bytearray(resp.read()), dtype="uint8")
    image = cv2.imdecode(image, cv2.IMREAD_COLOR)
    return image

# reddit set-up
reddit = praw.Reddit(client_id = 'id',
                     client_secret = 'secret',
                     user_agent = 'agent')

subreddit = reddit.subreddit('pics')
hot_pics = subreddit.hot(limit=10)

for submission in hot_pics:
    if not submission.stickied:
        print(submission.url)

# print images  
urls = [submission.url]
for url in urls:
    image = reddit_scrape(url)
    cv2.imshow('image', image)
    cv2.waitKey(0)

My problem when I run this is that although the print(submission.url) line prints the URLs of all the top 10 posts, only the last URL in the list is actually opened and displayed.

My guess is that the error lies somewhere in my definition of

urls = [submission.url]

But I can't define 'urls' to be a static list of urls, because the hot list changes over time.

What am I doing wrong? Is there even a right way to do this? Any help would be greatly appreciated.


Solution

  • After your for loop finishes, submission still refers to the last submission it iterated over. Because you construct urls outside the loop, urls = [submission.url] contains only that last URL. Instead, create an empty list and append each URL inside the loop:

    urls = []
    for submission in hot_pics:
        if not submission.stickied:
            urls.append(submission.url)
    

    Or, more Pythonically, with a list comprehension:

    urls = [submission.url for submission in hot_pics if not submission.stickied]
    

    Then for url in urls will loop through every collected URL, not just the last one.
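
To see the difference without a Reddit account, here is a minimal offline sketch. It assumes nothing about PRAW beyond the two attributes used in the question (url and stickied); fake_hot_pics and the example.com URLs are hypothetical stand-ins, not real submissions.

```python
from types import SimpleNamespace

# Hypothetical stand-ins for PRAW submissions; only .url and .stickied are used.
fake_hot_pics = [
    SimpleNamespace(url="https://example.com/a.jpg", stickied=True),
    SimpleNamespace(url="https://example.com/b.jpg", stickied=False),
    SimpleNamespace(url="https://example.com/c.jpg", stickied=False),
]

# Pattern from the question: the loop only prints, and the loop variable
# keeps pointing at the final item after the loop ends.
for submission in fake_hot_pics:
    if not submission.stickied:
        print(submission.url)
last_only = [submission.url]

# Pattern from the answer: collect every non-stickied URL while iterating.
urls = [s.url for s in fake_hot_pics if not s.stickied]

print(last_only)  # a one-element list holding only the final URL
print(urls)       # every non-stickied URL, in order
```

With real PRAW data, the same urls list can then be passed to the display loop from the question unchanged.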