Tags: python, exception, urllib2, urllib, continue

How to handle urllib2 socket timeouts?


The following code has handled other links that timed out and continued to the next link in the loop. For this link, however, I got an error. I am not sure why, or how to fix it so that when a timeout happens the loop simply moves on to the next image.

try:
    image_file = urllib2.urlopen(submission.url, timeout = 5)
    with open('/home/mona/computer_vision/image_retrieval/images/'
              + category + '/'
              + datetime.datetime.now().strftime('%y-%m-%d-%s')
              + submission.url[-5:], 'wb') as output_image:
        output_image.write(image_file.read())
except urllib2.URLError as e:
    print(e)
    continue

The error is:

[LOG] Done Getting http://i.imgur.com/b6fhEkWh.jpg
submission id is: 1skepf
[LOG] Getting url:  http://www.redbubble.com/people/crtjer/works/11181520-bling-giraffe
[LOG] Getting url:  http://www.youtube.com/watch?v=Y7iuOZVJhs0
[LOG] Getting url:  http://imgur.com/8a62PST
[LOG] Getting url:  http://www.youtube.com/watch?v=DFZFiFCsTc8
[LOG] Getting url:  http://i.imgur.com/QPpOFVv.jpg
[LOG] Done Getting http://i.imgur.com/QPpOFVv.jpg
submission id is: 1f3amu
[LOG] Getting url:  http://25.media.tumblr.com/tumblr_lstla7vqK71ql5q9zo1_500.jpg
Traceback (most recent call last):
  File "download.py", line 50, in <module>
    image_file = urllib2.urlopen(submission.url, timeout = 5)
  File "/usr/lib/python2.7/urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 404, in open
    response = self._open(req, data)
  File "/usr/lib/python2.7/urllib2.py", line 422, in _open
    '_open', req)
  File "/usr/lib/python2.7/urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 1214, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib/python2.7/urllib2.py", line 1187, in do_open
    r = h.getresponse(buffering=True)
  File "/usr/lib/python2.7/httplib.py", line 1051, in getresponse
    response.begin()
  File "/usr/lib/python2.7/httplib.py", line 415, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.7/httplib.py", line 371, in _read_status
    line = self.fp.readline(_MAXLINE + 1)
  File "/usr/lib/python2.7/socket.py", line 476, in readline
    data = self._sock.recv(self._rbufsize)
socket.timeout: timed out

Solution

  • Explicitly catch the timeout exception, socket.timeout, which is not a subclass of urllib2.URLError: https://docs.python.org/3/library/socket.html#socket.timeout

    import socket

    try:
        image_file = urllib2.urlopen(submission.url, timeout=5)
    except urllib2.URLError as e:
        print(e)
        continue
    except socket.timeout:
        # socket.timeout is not a URLError, so it needs its own handler
        print("timed out")
        # Your timeout handling code here...
        continue
    else:
        with open('/home/mona/computer_vision/image_retrieval/images/'
                  + category + '/'
                  + datetime.datetime.now().strftime('%y-%m-%d-%s')
                  + submission.url[-5:], 'wb') as output_image:
            output_image.write(image_file.read())
    

    OP: Thanks! Following your suggestion, I added these handlers, and that solved my problem on Python 2.7:

    except socket.timeout as e:
        print(e)
        continue
    except socket.error as e:
        print(e)
        continue
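
Putting the handlers above together, here is a minimal sketch of a helper that returns the image bytes or None, so the loop can just skip a failed link. The name fetch_bytes and its opener parameter are hypothetical (they are not from the question), and the import fallback is only there so the sketch also runs on Python 3. Note that read() goes over the same socket, so it can raise socket.timeout too; that is why it sits inside the try here rather than in an else block.

```python
import socket

try:
    # Python 2.7, as used in the question
    from urllib2 import urlopen, URLError
except ImportError:
    # Python 3 equivalents of the same names
    from urllib.request import urlopen
    from urllib.error import URLError


def fetch_bytes(url, timeout=5, opener=urlopen):
    """Return the response body, or None if the request fails or times out.

    Hypothetical helper; `opener` is injectable only to make testing easy.
    """
    try:
        response = opener(url, timeout=timeout)
        # read() also uses the socket, so a timeout can surface here as well
        return response.read()
    except URLError as e:
        print("URL error for %s: %s" % (url, e))
    except socket.timeout as e:
        print("timed out for %s: %s" % (url, e))
    except socket.error as e:
        print("socket error for %s: %s" % (url, e))
    return None
```

In the download loop this would be used roughly as `data = fetch_bytes(submission.url)` followed by `if data is None: continue`, with the file written only when data is not None.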