Tags: python, with-statement, try-except

How to properly use try/except/with inside functions and main()


I am a relative Python newbie and I am confused about how to handle exceptions properly. Apologies for the dumb question.

In my main() I iterate through a list of dates and for each date I call a function, which downloads a csv file from a public web server. I want to properly catch exceptions for obvious reasons but especially because I do not know when the files of interest will be available for download. My program will execute as part of a cron job and will attempt to download these files every 3 hours if available.

What I want is to download the first file in the list of dates and if that results in a 404 then the program shouldn't proceed to the next file because the assumption is if the oldest date in the list is not available then none of the others that come after it will be available either.

I have the following Python pseudocode. I have try/except blocks inside the function that attempts to download the files, but if an exception occurs inside the function, how do I properly handle it in main() so I can decide whether to proceed to the next date? I created a function to perform the download because I want to reuse that code later in the same main() block for other file types.

import httplib
import urllib2

def main():
    ...
    ...
    # datelist is a list of date objects
    for date in datelist:
        download_file(date)

def download_file(date):
    date_string = date.strftime('%Y%m%d')
    # Assumption: the local file is named after the remote one
    filename = date_string + FILE_SUFFIX
    request = HTTP_WEB_PREFIX + filename
    try:
        response = urllib2.urlopen(request)
    except urllib2.HTTPError as e:
        print "HTTPError = " + str(e)
    except urllib2.URLError as e:
        print "URLError = " + str(e)
    except httplib.HTTPException as e:
        print "HTTPException = " + str(e)
    except IOError as e:
        print "IOError = " + str(e)
    except Exception:
        import traceback
        print "Generic exception: " + traceback.format_exc()
    else:
        print "No problem downloading %s - continue..." % request
        try:
            # The with block closes the file, so no explicit f.close() is needed
            with open(TMP_DOWNLOAD_DIRECTORY + filename, 'wb') as f:
                f.write(response.read())
        except IOError as e:
            print "IOError = " + str(e)

Solution

  • The key concept here is, if you can fix the problem, you should trap the exception; if you can't, it's the caller's problem to deal with. In this case, the downloader can't fix things if the file isn't there, so it should bubble up its exceptions to the caller; the caller should know to stop the loop if there's an exception.

    So let's move all the exception handling out of the function and into the loop, and make the loop stop as soon as a download fails, as the spec requires:

    for date in datelist:
        # Format the date as YYYYMMDD
        date_string = date.strftime('%Y%m%d')
        try:
            download_file(date_string)
        except Exception as e:
            # Any failure means later dates won't be available either, so stop
            print("Error downloading for date %s: %s" % (date_string, e))
            break
    

    download_file should now, unless you want to add retries or something similar, simply not trap exceptions at all. Since the caller now formats the date string, that code can come out of download_file as well, giving the much simpler:

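    def download_file(date_string):
        # Assumption: the local file is named after the remote one
        filename = date_string + FILE_SUFFIX
        request = HTTP_WEB_PREFIX + filename
        # No try/except here: any failure propagates to the caller
        response = urllib2.urlopen(request)
        print "No problem downloading %s - continue..." % request
        # The with block closes the file, so no explicit f.close() is needed
        with open(TMP_DOWNLOAD_DIRECTORY + filename, 'wb') as f:
            f.write(response.read())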
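
If you only want to stop quietly on a 404 and treat anything else as a real failure, the caller can inspect the error code. The following is a minimal sketch of that idea, not part of the original answer. It assumes Python 3, where urllib2's exceptions live in urllib.error. The names download_all and the download_file parameter are hypothetical; passing the downloader in as an argument is only there so the loop can be tried without touching the network.

```python
import urllib.error
from datetime import date


def download_all(dates, download_file):
    """Try each date in order; stop at the first one that returns a 404.

    `download_file` is any callable that takes a 'YYYYMMDD' string and
    raises on failure (hypothetical interface, mirroring the function above).
    Returns the date strings that were downloaded successfully.
    """
    done = []
    for d in dates:
        date_string = d.strftime('%Y%m%d')
        try:
            download_file(date_string)
        except urllib.error.HTTPError as e:
            if e.code == 404:
                # Oldest missing file: assume later ones are missing too
                print("Not available yet: %s; stopping" % date_string)
                break
            # Any other HTTP error is unexpected here, so let it propagate
            raise
        done.append(date_string)
    return done
```

Re-raising the non-404 errors is a design choice: under cron, an uncaught exception ends the run with a traceback, which is usually easier to notice than a silently skipped file. If you would rather log and stop, replace the raise with a message and a break.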
    

    I would suggest that the print statement is superfluous, but if you really want it, the logging module is a more flexible way forward: it lets you turn the message on or off later by changing a config file instead of the code.
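
As a rough illustration of the logging suggestion, here is a small sketch using the standard logging module (shown for Python 3, though the module works the same way in Python 2). The logger name "downloader", the helper report_download and the example URLs are made up for the example.

```python
import logging

logger = logging.getLogger("downloader")  # hypothetical logger name


def report_download(url):
    """Log a successful download instead of printing it."""
    logger.info("No problem downloading %s - continue...", url)


# Attach a basic handler to the root logger so messages are shown on stderr
logging.basicConfig(format="%(levelname)s %(message)s")

logger.setLevel(logging.INFO)      # progress messages on
report_download("http://example.invalid/20240101.csv")

logger.setLevel(logging.WARNING)   # progress messages off; warnings and errors still shown
report_download("http://example.invalid/20240102.csv")
```

Here the level is set in code to keep the example short. The standard library also offers logging.config.fileConfig and logging.config.dictConfig for reading the same settings from a configuration file, which is what makes it possible to switch the messages on or off without editing the program.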