python django csv django-rest-framework zip

How can I append a CSV file to an existing zip archive in Python?


I am calling an API that returns a zip file. Before sending it to the client for download, I want to append a CSV file that I create myself. Below is how I create the CSV and my attempt at appending it. The result is what seems to be an endlessly nested zip file that keeps appending the number 2 to the file name: macOS appears to keep converting it from a zip to a cpgz and back again. If I run unzip on the file, I get the following error:

End-of-central-directory signature not found. Either this file is not a zipfile, or it constitutes one disk of a multi-part archive. In the latter case the central directory and zipfile comment will be found on the last disk(s) of this archive.

Code to generate CSV in memory

transactions_csv = io.StringIO()
writer = csv.DictWriter(transactions_csv, fieldnames=all_parsed_transactions[0].keys())
writer.writeheader()
for transaction in all_parsed_transactions:
    writer.writerow(transaction)

return transactions_csv

Code attempting to append to existing zip

export = io.BytesIO(request.export_data())  # This is a zip response
transaction_csv = request.export_transactions()  # This calls the code above

if transaction_csv is not None and export is not None:
    new_zip = zipfile.ZipFile(export, "a", zipfile.ZIP_DEFLATED)
    new_zip.write("test.csv", transaction_csv.getvalue())
    new_zip.close()

    return HttpResponse(new_zip, content_type='application/zip')

Solution

  • At first glance, it looks like

    export = request.export_data()

    returns raw bytes. ZipFile needs a path or a file-like object, so try wrapping the result:

    export = io.BytesIO(request.export_data())
    export.seek(0)  # rewind to the beginning of the buffer
    # A freshly created BytesIO already starts at position 0, so this is only a
    # safeguard: I have had data come back corrupt when the cursor was elsewhere.
    

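    UPD: there are more problems than the one I described. ZipFile.write() expects the path of a file on disk as its first argument, so in-memory data has to be added with writestr(name, data) instead. The response should also be built from the buffer's bytes (export.getvalue()), not from the ZipFile object.

    Here is a full working example: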

    import csv
    import io
    import zipfile
    
    
    transactions = [
        {"f1": 1, "f2": 2, "f3": 3},
        {"f1": 3, "f2": 1, "f3": 2},
        {"f1": 2, "f2": 3, "f3": 1},
    ]
    
    
    transactions_csv = io.StringIO()
    writer = csv.DictWriter(transactions_csv, fieldnames=["f1", "f2", "f3"])
    writer.writeheader()
    for transaction in transactions:
        writer.writerow(transaction)
    transactions_csv.seek(0)
    
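    # Open the existing archive for reading and writing, then append to it.
    with open("file.zip", "r+b") as z:
        with zipfile.ZipFile(z, "a", zipfile.ZIP_DEFLATED, allowZip64=False) as new_zip:
            # writestr() stores in-memory data under the given archive name.
            new_zip.writestr("transaction3.csv", transactions_csv.read())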
    

    NOTE: I'm using file-like objects everywhere, assuming that this is one of the main problems. The code would be much simpler if your zip file were saved on disk.
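
For the case in the question, where the zip arrives as bytes and never touches the disk, the same idea can be sketched fully in memory. This is a minimal sketch, not the asker's actual code: the helper name append_csv_to_zip, the archive name transactions.csv, and the small stand-in zip built at the bottom are all hypothetical, and it assumes the API returns a complete, valid zip as bytes.

```python
import csv
import io
import zipfile


def append_csv_to_zip(zip_bytes, rows, arcname="transactions.csv"):
    """Return new zip bytes with a CSV built from `rows` appended.

    Hypothetical helper: `rows` is assumed to be a non-empty list of dicts
    that all share the same keys.
    """
    # Build the CSV text in memory.
    csv_buffer = io.StringIO()
    writer = csv.DictWriter(csv_buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)

    # Wrap the raw bytes so ZipFile gets a seekable, writable file-like object.
    zip_buffer = io.BytesIO(zip_bytes)
    with zipfile.ZipFile(zip_buffer, "a", zipfile.ZIP_DEFLATED) as archive:
        # writestr() takes (archive name, data); write() expects a path on disk.
        archive.writestr(arcname, csv_buffer.getvalue())

    # Closing the archive writes the central directory into the buffer.
    # Return the buffer's contents, not the ZipFile object.
    return zip_buffer.getvalue()


# Stand-in for the API response: a small zip built in memory.
original = io.BytesIO()
with zipfile.ZipFile(original, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("readme.txt", "hello")

rows = [{"f1": 1, "f2": 2, "f3": 3}, {"f1": 3, "f2": 1, "f3": 2}]
result = append_csv_to_zip(original.getvalue(), rows)

with zipfile.ZipFile(io.BytesIO(result)) as archive:
    print(archive.namelist())
```

In a Django view, the returned bytes could then be passed along as HttpResponse(result, content_type='application/zip'). Reading the buffer only after the ZipFile has been closed matters, because the end-of-central-directory record that unzip complained about is written on close.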