
Remove auto-generated __MACOSX folder from inside a zip file in Python


I have zip files uploaded by clients through a web server that sometimes contain pesky __MACOSX directories inside that gum things up. How can I remove these?

I thought of using ZipFile, but this answer says that isn't possible and gives this suggestion:

Read out the rest of the archive and write it to a new zip file.

How can I do this with ZipFile? Another Python based alternative like shutil or something similar would also be fine.


Solution

  • The examples below handle a zip file that contains '__MACOSX' entries: a new zip archive is created and every entry that is not under __MACOSX is written to it (Example Two first checks whether any such entry exists). The same approach can be extended to skip .DS_Store files. Please let me know if you also need to delete the old zip file and replace it with the new, clean one.

    Hopefully, these answers help you solve your issue.

    Example One

    from zipfile import ZipFile

    # Copy every entry except those under __MACOSX/ into a new archive.
    with ZipFile('original.zip', 'r') as original_zip, \
         ZipFile('new_archive.zip', 'w') as new_zip:
        for item in original_zip.infolist():
            # Skip the __MACOSX folder and everything inside it.
            if not item.filename.startswith('__MACOSX/'):
                # Passing the ZipInfo keeps the entry's name, timestamp
                # and compression type.
                new_zip.writestr(item, original_zip.read(item.filename))
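
The answer notes that the same approach can be extended to .DS_Store files. A minimal sketch of that extension follows. The helper names `is_macos_metadata` and `copy_without_macos_metadata` are made up for illustration, and matching `__MACOSX` at any depth (not only at the archive root) is an assumption that goes beyond the examples in this answer.

```python
from pathlib import PurePosixPath
from zipfile import ZipFile


def is_macos_metadata(name):
    """Return True for __MACOSX entries and .DS_Store files at any depth."""
    # Zip entry names always use forward slashes, hence PurePosixPath.
    parts = PurePosixPath(name).parts
    return "__MACOSX" in parts or parts[-1:] == (".DS_Store",)


def copy_without_macos_metadata(source_path, target_path):
    """Copy every entry except macOS metadata into a new archive."""
    with ZipFile(source_path, "r") as source, ZipFile(target_path, "w") as target:
        for item in source.infolist():
            if not is_macos_metadata(item.filename):
                # Passing the ZipInfo keeps the entry's name and compression type.
                target.writestr(item, source.read(item.filename))
```

Comparing whole path components rather than using `startswith` means a legitimate file such as `notes/__MACOSX.txt` is left alone, while `sub/.DS_Store` is dropped.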
    

    Example Two

    from zipfile import ZipFile

    def check_archive_for_bad_filename(file):
        """Return True if the archive contains any entry under __MACOSX/."""
        with ZipFile(file, 'r') as zip_file:
            return any(name.startswith('__MACOSX/') for name in zip_file.namelist())

    def remove_bad_filename_from_archive(original_file, temporary_file):
        """Copy every entry except those under __MACOSX/ into temporary_file."""
        # Open the output once in 'w' mode so that a leftover file from an
        # earlier run is overwritten instead of appended to.
        with ZipFile(original_file, 'r') as zip_file, \
             ZipFile(temporary_file, 'w') as new_zip:
            for item in zip_file.infolist():
                if not item.filename.startswith('__MACOSX/'):
                    new_zip.writestr(item, zip_file.read(item.filename))


    archive_filename = 'old.zip'
    temp_filename = 'new.zip'

    if check_archive_for_bad_filename(archive_filename):
        print('Removing MACOSX file from archive.')
        remove_bad_filename_from_archive(archive_filename, temp_filename)
    else:
        print('No MACOSX file in archive.')
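
If the old zip file should be replaced by the cleaned one, something along these lines might work. This is only a sketch: the function name `strip_macosx_in_place` is hypothetical, and writing to a temporary file in the same directory before swapping it in with `os.replace` is one possible design, not something the examples above do.

```python
import os
import tempfile
from zipfile import ZipFile


def strip_macosx_in_place(archive_path):
    """Rewrite archive_path without __MACOSX entries; return True if changed."""
    with ZipFile(archive_path, "r") as source:
        items = source.infolist()
        if not any(item.filename.startswith("__MACOSX/") for item in items):
            return False  # nothing to remove, leave the file untouched

        # Create the temporary file next to the original so that
        # os.replace stays on a single filesystem.
        directory = os.path.dirname(os.path.abspath(archive_path))
        handle, temp_path = tempfile.mkstemp(suffix=".zip", dir=directory)
        os.close(handle)
        try:
            with ZipFile(temp_path, "w") as target:
                for item in items:
                    if not item.filename.startswith("__MACOSX/"):
                        target.writestr(item, source.read(item.filename))
        except BaseException:
            os.remove(temp_path)  # do not leave a half-written file behind
            raise

    # The original is closed at this point, so it can be swapped out.
    os.replace(temp_path, archive_path)
    return True
```

Because the original is only replaced after the new archive has been written completely, a failure partway through leaves the uploaded file as it was.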