.zip file gets corrupted when compressed with zlib and sent with the Gmail API


I am using Python 3.7 and compressing a .csv file using Python's zipfile module with zlib (ZIP_DEFLATED).

import zipfile

filename = "report.csv"

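# Swap the ".csv" extension for ".zip"
zip_filename = f"{filename[:-4]}.zip"
# Named zf rather than zip to avoid shadowing the built-in zip()
with zipfile.ZipFile(zip_filename, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.write(filename)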
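
To rule out the archive itself as the source of the corruption, it can be checked locally before it is attached. This is only a minimal sketch: the temporary directory and the sample CSV contents are made up for illustration, and `testzip()` only verifies the CRCs of the archive members.

```python
import os
import tempfile
import zipfile

# Hypothetical sample data standing in for the real report.csv
workdir = tempfile.mkdtemp()
csv_path = os.path.join(workdir, "report.csv")
with open(csv_path, "w") as f:
    f.write("id,value\n1,foo\n2,bar\n")

# Same compression settings as above
zip_path = os.path.splitext(csv_path)[0] + ".zip"
with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.write(csv_path, arcname=os.path.basename(csv_path))

# Re-open the archive and verify it: testzip() returns None when every
# member passes its CRC check, otherwise the name of the first bad member.
with zipfile.ZipFile(zip_path) as zf:
    first_bad = zf.testzip()
    names = zf.namelist()

print(first_bad, names)
```

If this check passes on the sending machine, the archive is fine on disk and the damage happens somewhere in the mail handling.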

The zip file is then attached to an email. I have some logic to determine its MIME type (I have checked that it correctly resolves to application/zip):

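import mimetypes
import os
from email.mime.base import MIMEBase

def _make_attachment_part(self, filename: str) -> MIMEBase:
    content_type, encoding = mimetypes.guess_type(filename)
    # Fall back to a generic binary type if the type is unknown or the file is itself encoded (e.g. gzip)
    if content_type is None or encoding is not None:
        content_type = "application/octet-stream"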

    main_type, sub_type = content_type.split("/", 1)
    msg = MIMEBase(main_type, sub_type)
    with open(filename, "rb") as f:
        msg.set_payload(f.read())

    base_filename = os.path.basename(filename)
    msg.add_header("Content-Disposition", "attachment", filename=base_filename)

    return msg

Next, the subject, recipients, cc, attachments, etc. are set on the message, which is a MIMEMultipart. Finally, I encode the whole message with URL-safe Base64 and send it through.

raw_message = base64.urlsafe_b64encode(message.as_bytes()).decode()

I receive my attachment named correctly and of the expected size. However, when I try to run unzip file.zip, I get the following error:

error [file.zip]:  missing 5 bytes in zipfile

Does anyone have any idea what I'm doing wrong? In case it matters, the email is sent from an Ubuntu machine, whereas I'm trying to open the received file on macOS.


Solution

  • As defined in RFC 1341:

    An encoding type of 7BIT requires that the body is already in a seven-bit mail-ready representation. This is the default value -- that is, "Content-Transfer-Encoding: 7BIT" is assumed if the Content-Transfer-Encoding header field is not present.

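    In your case, the _make_attachment_part function sets the payload on your MIMEBase object but never specifies a Content-Transfer-Encoding, so the part is assumed to be 7BIT. A zip archive is raw binary data rather than seven-bit text, so it can be mangled when the message is serialized and delivered.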

    I suggest that you encode your payload as Base64. You can do it as follows:

    1. Import the encoders module:
    from email import encoders
    
    2. Inside your _make_attachment_part function, encode your payload using the encoders module:
    def _make_attachment_part(self, filename: str) -> MIMEBase:
        content_type, encoding = mimetypes.guess_type(filename)
        if content_type is None or encoding is not None:
            content_type = "application/octet-stream"
    
        main_type, sub_type = content_type.split("/", 1)
        msg = MIMEBase(main_type, sub_type)
        with open(filename, "rb") as f:
            msg.set_payload(f.read())
    
        encoders.encode_base64(msg)  # NEW: Base64-encodes the payload and sets Content-Transfer-Encoding: base64
    
        base_filename = os.path.basename(filename)
        msg.add_header("Content-Disposition", "attachment", filename=base_filename)
    
        return msg
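
To check that the fix preserves the archive, the whole path can be exercised locally without sending anything. The following is a minimal sketch under a few assumptions: the zip is built in memory from made-up CSV data, `build_zip_bytes` and `make_part` are hypothetical helpers standing in for your own code, and the receiving side is simulated by decoding and re-parsing the raw message.

```python
import base64
import io
import zipfile
from email import encoders, message_from_bytes
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart


def build_zip_bytes() -> bytes:
    """Create a small in-memory zip archive (stand-in for report.zip)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("report.csv", "id,value\n1,foo\n2,bar\n" * 100)
    return buf.getvalue()


def make_part(data: bytes) -> MIMEBase:
    """Build the attachment part the same way as the fixed function above."""
    part = MIMEBase("application", "zip")
    part.set_payload(data)
    encoders.encode_base64(part)  # also sets Content-Transfer-Encoding: base64
    part.add_header("Content-Disposition", "attachment", filename="report.zip")
    return part


original = build_zip_bytes()
message = MIMEMultipart()
message.attach(make_part(original))

# Same outer encoding as in the question (the value sent to the API)
raw_message = base64.urlsafe_b64encode(message.as_bytes()).decode()

# Simulated receiving side: undo the outer encoding and parse the message
parsed = message_from_bytes(base64.urlsafe_b64decode(raw_message))
attachment = parsed.get_payload()[0]
received = attachment.get_payload(decode=True)  # undoes the part's Base64

header = attachment["Content-Transfer-Encoding"]
intact = received == original
bad_member = zipfile.ZipFile(io.BytesIO(received)).testzip()

print(header, intact, bad_member)
```

Note that the URL-safe Base64 applied to the whole message is a separate layer from the part's own Content-Transfer-Encoding; the former wraps the serialized message for the request, while the latter tells the recipient's mail client how to decode the attachment.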