Tags: python, python-3.x, csv, io, gzip

Using csv.DictWriter to output an in-memory gzipped csv file?


I want to use a DictWriter from Python's csv module to generate a .csv file that's compressed using GZip. I need to do this all in-memory, so utilizing local files is out of the question.

However, I'm having trouble dealing with each module's type requirements in Python 3. Assuming that I got the general structure right, I can't make both modules work together because DictWriter needs to write to an io.StringIO buffer, while GZip needs an io.BytesIO object.

So, when I try to do:

buffer = io.BytesIO()
compressed = gzip.GzipFile(fileobj=buffer, mode='wb')
dict_writer = csv.DictWriter(buffer, ["a", "b"], extrasaction="ignore")

I get:

TypeError: a bytes-like object is required, not 'str'

And trying to use io.StringIO with GZip doesn't work either. How can I go about this?


Solution

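  • A roundabout way would be to write it to an io.StringIO object first, then encode the content and write it through a GzipFile that wraps an io.BytesIO:

    import csv
    import gzip
    import io

    s = io.StringIO()  # text buffer for the csv module
    b = io.BytesIO()   # will hold the gzipped bytes
    
    dict_writer = csv.DictWriter(s, ["a", "b"], extrasaction="ignore")
    
    ... # complete your write operations ...
    
    # Write the encoded CSV text through GzipFile so it is compressed into b.
    # Leaving the with block closes the GzipFile, which flushes the gzip trailer
    # but does not close b.
    with gzip.GzipFile(fileobj=b, mode='wb') as compressed:
        compressed.write(s.getvalue().encode('utf-8')) # or an encoding of your choice
    
    data = b.getvalue()  # the compressed bytes; read them before closing b
    
    ... 
    
    s.close()   # Remember to close your streams!
    b.close()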
    

    Though as @wwii's comment suggests, depending on the size of your data, it may be more worthwhile to write your own csv in bytes instead.
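
If holding the whole CSV as text before compressing is a concern, another option (a different technique from the two above, not something taken from the original answer or the comment) is to let io.TextIOWrapper sit between the DictWriter and the GzipFile, so each row is encoded and compressed as it is written. The sketch below is a minimal illustration under that assumption; the function name rows_to_gzipped_csv and its parameters are made up for the example.

```python
import csv
import gzip
import io


def rows_to_gzipped_csv(rows, fieldnames, encoding="utf-8"):
    """Return gzip-compressed CSV bytes built entirely in memory.

    Hypothetical helper for illustration only.
    """
    buffer = io.BytesIO()
    gz = gzip.GzipFile(fileobj=buffer, mode="wb")
    # TextIOWrapper accepts str from the csv module and hands encoded bytes
    # to the GzipFile. newline="" leaves the csv module's line endings alone.
    with io.TextIOWrapper(gz, encoding=encoding, newline="") as text:
        writer = csv.DictWriter(text, fieldnames, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(rows)
    # Closing the wrapper also closed the GzipFile, which finishes the gzip
    # stream but leaves the underlying BytesIO open for reading.
    return buffer.getvalue()


payload = rows_to_gzipped_csv([{"a": 1, "b": 2, "c": 3}], ["a", "b"])
```

Here payload is a bytes object that can be sent or stored directly; gzip.decompress(payload) gives back the encoded CSV text. The extra key "c" is dropped because of extrasaction="ignore".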