I'm trying to write a program where I can input a list of image URLs and have all of them downloaded automatically to a folder. The problem arises when there's a dead link somewhere in the batch. Obviously, I don't want to go in and manually remove 1000+ dead links, so I just want to "skip" over them.
Here is what I have so far:
import pandas as pd
import urllib.error
import urllib.request
import time
def url_to_jpg(i, url, file_path):
    filename = 'image-{}.jpg'.format(i)
    full_path = '{}{}'.format(file_path, filename)
    urllib.request.urlretrieve(url, full_path)
    print('{} saved.'.format(filename))
    return None
FILENAME = 'images.csv'
FILE_PATH = 'images/'
urls = pd.read_csv(FILENAME)
while True:
    try:
        for i, url in enumerate(urls.values):
            url_to_jpg(i, url[0], FILE_PATH)
    except urllib.error.HTTPError:
        continue
    break
I am just a beginner, and that last part with catching exceptions is as far as I got.
Sorry for the messy code, I am in a rush and have no time.
If you can spare the time, replace this code:
while True:
    try:
        for i, url in enumerate(urls.values):
            url_to_jpg(i, url[0], FILE_PATH)
    except urllib.error.HTTPError:
        continue
    break
with:
for i, url in enumerate(urls.values):
    try:
        url_to_jpg(i, url[0], FILE_PATH)
    except urllib.error.HTTPError:
        continue
Note that your while True: loop doesn't do what you want. The try wraps the entire for loop, so a single HTTPError terminates the whole loop; the continue in the except clause then jumps back to the top of the while loop, which restarts the for loop from image 0. The first dead link therefore makes you re-download everything, hit the same link again, and repeat forever. The break is only reached on a pass where no exception was raised at all, which is why the fix is to move the try/except inside the for loop so it guards one download at a time.
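For what it's worth, here is a minimal sketch of the whole script with that fix applied. It's based on your code, with two assumptions on my part: I catch urllib.error.URLError (the parent class of HTTPError, so genuinely dead hosts and DNS failures are skipped too, not just 404s), and I create the output folder up front with os.makedirs so the script doesn't need it to exist beforehand:

import os
import urllib.error
import urllib.request

import pandas as pd

FILENAME = 'images.csv'
FILE_PATH = 'images/'

def url_to_jpg(i, url, file_path):
    # Save the image at `url` as image-<i>.jpg inside file_path.
    filename = 'image-{}.jpg'.format(i)
    full_path = '{}{}'.format(file_path, filename)
    urllib.request.urlretrieve(url, full_path)
    print('{} saved.'.format(filename))

os.makedirs(FILE_PATH, exist_ok=True)  # create images/ if it doesn't exist yet

# header=None assumes the CSV is a bare list of URLs with no header row;
# drop it if your file does have a header.
urls = pd.read_csv(FILENAME, header=None)

for i, url in enumerate(urls.values):
    try:
        url_to_jpg(i, url[0], FILE_PATH)
    except urllib.error.URLError:
        # HTTPError (404s etc.) is a subclass of URLError, so this also
        # catches DNS failures and refused connections from dead links.
        print('skipping {}'.format(url[0]))

One more note: pd.read_csv treats the first row of the file as a header by default, so without header=None your very first URL would be swallowed as a column name rather than downloaded.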