Writing StringIO back to disk in Python


I created an in-memory zip file using StringIO and zipfile:

inMemoryZip = StringIO()
outfile = zipfile.ZipFile(inMemoryZip, 'w', compression=zipfile.ZIP_DEFLATED)
# outfile.write(stuff)
inMemoryZip.seek(0)
return inMemoryZip

This data is uploaded to a server/database. At some point, it's retrieved and I need to write it to disk as a zip file. I'm not sure how to do this. I tried the following:

with open('pathToNewZip.zip', 'w') as archive:
  archive.write(inMemoryZip.getvalue())

This creates the zip file, but when I double-click it I can't see its contents. Instead, it creates a duplicate of itself next to the original (another identical .zip file, but with the extension .zip.cpgz).

I also tried:

with open('pathToNewZip.zip', 'w') as archive:
      shutil.copyfileobj(inMemoryZip, archive)

but it's the same result as above.

I guess part of the problem is I don't know how to parse this data. The inMemoryZip (which is an instance of StringIO) could have multiple files in it.

Is each file written as a new row?

What does .getvalue() return? The data for all the files in the StringIO instance?

How do I determine the names of the files I zipped up previously so they can be re-created with the same name?

There's this: "How can I pass a Python StringIO() object to a ZipFile(), or is it not supported?", but it looks like that only works if there's one file being written at a time. My StringIO instance can have 1 file or 10 files.


Solution

  • You need to open the final file in "wb" mode, as suggested by @Kupiakos; otherwise you can end up with a corrupted archive. Your other problem is that you need to close the zipfile before using the StringIO buffer. Closing it writes the file catalog (the zip's central directory) to the buffer. Without that, an unzipper may make assumptions, such as this being only one part of a multipart zip, and do odd things. Just add outfile.close(), or put the ZipFile in a context manager (a with clause):

    def foo():
        inMemoryZip = StringIO()
        # Leaving the with block closes the ZipFile, which writes the catalog.
        with zipfile.ZipFile(inMemoryZip, 'w', compression=zipfile.ZIP_DEFLATED) as outfile:
            # outfile.write(stuff)
            pass
        inMemoryZip.seek(0)
        return inMemoryZip
    

    Then, later:

    with open('pathToNewZip.zip', 'wb') as archive:
      archive.write(inMemoryZip.getvalue())
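
For the remaining questions (what getvalue() returns and how to recover the file names), here is a minimal sketch of the whole round trip. It assumes Python 3, where io.BytesIO takes the place of the StringIO used above. The function names, the sample data, and the temporary output directory are hypothetical and only there for illustration.

```python
import io
import os
import tempfile
import zipfile


def build_zip_in_memory(files):
    """Return a BytesIO holding a finished zip built from {name: bytes}."""
    buf = io.BytesIO()
    # Leaving the with block closes the ZipFile, which writes the catalog.
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    buf.seek(0)
    return buf


def save_zip(buf, path):
    """Write the in-memory archive to disk in binary mode."""
    with open(path, "wb") as archive:
        archive.write(buf.getvalue())


def list_members(path):
    """Return the member names stored in the archive at path."""
    with zipfile.ZipFile(path) as zf:
        return zf.namelist()


# Hypothetical sample data, for illustration only.
sample = {"a.txt": b"first file", "docs/b.txt": b"second file"}
buf = build_zip_in_memory(sample)

out_dir = tempfile.mkdtemp()
zip_path = os.path.join(out_dir, "pathToNewZip.zip")
save_zip(buf, zip_path)

# The stored names come back from the archive's catalog.
names = list_members(zip_path)

# extractall() re-creates every member under its stored name.
with zipfile.ZipFile(zip_path) as zf:
    zf.extractall(out_dir)
```

getvalue() returns the entire contents of the buffer, that is, the bytes of the whole archive, however many files it holds. The individual files are not separate rows in the buffer; they are members of the zip, and namelist() (or extractall()) recovers their names from the catalog that close() wrote.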