python gzip urllib

How can I create a GzipFile instance from the “file-like object” that urllib.urlopen() returns?


I’m playing around with the Stack Overflow API using Python. I’m trying to decode the gzipped responses that the API gives.

import urllib, gzip

url = urllib.urlopen('http://api.stackoverflow.com/1.0/badges/name')
gzip.GzipFile(fileobj=url).read()

According to the urllib2 documentation, urlopen “returns a file-like object”.

However, when I run read() on the GzipFile object I’ve created using it, I get this error:

AttributeError: addinfourl instance has no attribute 'tell'

As far as I can tell, this is coming from the object returned by urlopen.

It doesn’t appear to have seek either, as I get an error when I do this:

url.read()
url.seek(0)

What exactly is this object, and how do I create a functioning GzipFile instance from it?


Solution

  • The urlopen docs list the supported methods of the object that is returned. I recommend wrapping the object in another class that supports the methods that gzip expects.

    Another option: call the read method of the response object and put the result in a StringIO object (which supports the methods that gzip expects, including seek and tell). This may be a little more expensive, though, because the whole response is held in memory.

    E.g.

    import gzip
    import json
    import StringIO
    import urllib
    
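    url = urllib.urlopen('http://api.stackoverflow.com/1.0/badges/name')
    # Buffer the whole response so gzip gets an object with seek() and tell().
    url_f = StringIO.StringIO(url.read())
    g = gzip.GzipFile(fileobj=url_f)
    # GzipFile is file-like, so json.load can read the decompressed data directly.
    j = json.load(g)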
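
The snippets above are Python 2. As a rough sketch of the same buffering idea in Python 3, the example below reads the body into an io.BytesIO before handing it to GzipFile. It also shows one possible way to avoid holding the compressed body in memory: decompressing chunk by chunk with zlib, which is a different technique from the one in the answer. The NonSeekableStream class, the helper function names, and the sample JSON body are hypothetical stand-ins so the code runs without network access; they are not part of urllib or the Stack Overflow API.

```python
import gzip
import io
import zlib


class NonSeekableStream:
    """Hypothetical stand-in for a network response: read() only, no seek()/tell()."""

    def __init__(self, payload):
        self._buffer = io.BytesIO(payload)

    def read(self, size=-1):
        return self._buffer.read(size)


def gunzip_buffered(response):
    """Read the whole body into memory, then decompress it through GzipFile."""
    in_memory = io.BytesIO(response.read())  # BytesIO supports seek() and tell()
    with gzip.GzipFile(fileobj=in_memory) as gz:
        return gz.read()


def gunzip_streaming(response, chunk_size=4096):
    """Yield decompressed data chunk by chunk without buffering the compressed body."""
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)
    while True:
        chunk = response.read(chunk_size)
        if not chunk:
            break
        yield decompressor.decompress(chunk)
    yield decompressor.flush()


# Simulated gzipped JSON body (an assumption, not real API output).
body = gzip.compress(b'{"badges": []}')

buffered = gunzip_buffered(NonSeekableStream(body))
streamed = b"".join(gunzip_streaming(NonSeekableStream(body)))
```

With a real response object, either helper would take the object returned by the URL-opening call in place of NonSeekableStream. The buffered version is simpler; the streaming version is probably only worth the extra code for large responses.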