Caching in urllib2?


Is there an easy way to cache responses when using urllib2 that I am overlooking, or do I have to roll my own?


Solution

  • You could use a memoizing decorator class such as:

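    class cache(object):
        """Memoize a function's return values for the life of the process."""
        def __init__(self, fun):
            self.fun = fun
            self.cache = {}

        def __call__(self, *args, **kwargs):
            # Build a hashable key; sorting kwargs makes their order irrelevant.
            key = (args, tuple(sorted(kwargs.items())))
            try:
                return self.cache[key]
            except KeyError:
                # First call with these arguments: compute and store the result.
                self.cache[key] = rval = self.fun(*args, **kwargs)
                return rval
            except TypeError:  # unhashable arguments can't form a key, so don't cache
                return self.fun(*args, **kwargs)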
    

    and define a function along the lines of:

    import urllib2

    @cache
    def get_url_src(url):
        return urllib2.urlopen(url).read()
    

    This assumes you don't need to honor HTTP cache-control headers and just want each page cached for the lifetime of the application. Entries are never expired or evicted, so the cache grows with every distinct URL requested.
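
For readers on Python 3, where urllib2 was folded into urllib.request, roughly the same behavior can be sketched with the standard library's functools.lru_cache instead of a hand-written class. This is only an illustration, assuming Python 3.2 or later; the function name get_url_src is carried over from the answer above, and like the original it ignores HTTP cache headers entirely.

```python
import functools
import urllib.request


# maxsize=None disables eviction, so results live as long as the process,
# mirroring the plain dict used by the hand-written decorator.
@functools.lru_cache(maxsize=None)
def get_url_src(url):
    """Fetch url once and return the body as bytes; repeat calls hit the cache."""
    with urllib.request.urlopen(url) as response:
        return response.read()
```

Calling get_url_src.cache_info() reports hits and misses, and get_url_src.cache_clear() empties the cache if a fresh fetch is needed.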