Tags: python, directory, zip, shutil

Compressing directory using shutil.make_archive() while preserving directory structure


I'm trying to zip a directory called test_dicoms to a zip file named test_dicoms.zip using the following code:

shutil.make_archive('/home/code/test_dicoms', 'zip', '/home/code/test_dicoms')

The problem is that when I unzip it, all of the files that were in /test_dicoms/ are extracted directly into /home/code/, rather than the folder /test_dicoms/ and all of its contents being extracted into /home/code/. For example, /test_dicoms/ contains a file called foo.txt; after I zip and unzip, its path is /home/code/foo.txt instead of /home/code/test_dicoms/foo.txt. How do I fix this? Also, some of the directories I'm working with are very large. Do I need to add anything to my code to enable ZIP64, or does the function handle that automatically?

Here's what's currently in the archive created:

[gwarner@jazz gwarner]$ unzip -l test_dicoms.zip
Archive: test_dicoms.zip
Length    Date       Time  Name
--------- ---------- ----- ----
    93324 09-17-2015 16:05 AAscout_b_000070
    93332 09-17-2015 16:05 AAscout_b_000125
    93332 09-17-2015 16:05 AAscout_b_000248

Solution

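  • Using the terms in the documentation, you have specified a root_dir but not a base_dir. Paths in the archive are stored relative to root_dir, so pointing root_dir at test_dicoms itself drops the folder name. Set root_dir to the parent directory and pass the folder as base_dir, like so: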

    shutil.make_archive('/home/code/test_dicoms',
                        'zip',
                        '/home/code/',
                        'test_dicoms')
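
To see the difference between the two calls, here is a minimal sketch that builds a throwaway directory and lists the archive members each way. The helper name archive_names and the temporary paths are hypothetical and only stand in for /home/code/test_dicoms.

```python
import os
import shutil
import tempfile
import zipfile


def archive_names(root_dir, base_dir=None):
    """Zip with make_archive into a scratch directory and return the file member names."""
    with tempfile.TemporaryDirectory() as out_dir:
        base_name = os.path.join(out_dir, "demo")
        archive_path = shutil.make_archive(base_name, "zip", root_dir, base_dir)
        with zipfile.ZipFile(archive_path) as zf:
            # Ignore explicit directory entries; only file paths matter here.
            return sorted(n for n in zf.namelist() if not n.endswith("/"))


with tempfile.TemporaryDirectory() as parent:
    # Hypothetical stand-in for /home/code/test_dicoms
    os.mkdir(os.path.join(parent, "test_dicoms"))
    with open(os.path.join(parent, "test_dicoms", "foo.txt"), "w") as f:
        f.write("example")

    # root_dir only: files land at the top level of the archive.
    flat = archive_names(os.path.join(parent, "test_dicoms"))
    # root_dir = parent, base_dir = folder name: the folder is preserved.
    nested = archive_names(parent, "test_dicoms")

print(flat)
print(nested)
```

The first list should show foo.txt at the top level of the archive, and the second should show it under test_dicoms/.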
    

    To answer your second question, it depends on the version of Python you are using. Starting with Python 3.4, ZIP64 extensions are available by default. Prior to Python 3.4, make_archive does not automatically create a file with ZIP64 extensions. If you are using an older version of Python and want ZIP64, you can invoke the underlying zipfile.ZipFile() directly.

    If you choose to use zipfile.ZipFile() directly, bypassing shutil.make_archive(), here is an example:

    import os
    import zipfile
    
    d = '/home/code/test_dicoms'
    
    # Work from the parent directory so that archive paths start with test_dicoms/
    os.chdir(os.path.dirname(d))
    with zipfile.ZipFile(d + '.zip',
                         "w",
                         zipfile.ZIP_DEFLATED,
                         allowZip64=True) as zf:
        # Walk the directory by its relative name and add every file
        for root, _, filenames in os.walk(os.path.basename(d)):
            for name in filenames:
                name = os.path.join(root, name)
                name = os.path.normpath(name)
                zf.write(name, name)
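
If changing the process's working directory is undesirable, one possible variant is to keep absolute paths on disk and compute each archive name relative to the parent directory. This is only a sketch: the function name zip_directory is hypothetical, and it assumes the output zip is written outside the directory being archived.

```python
import os
import zipfile


def zip_directory(src_dir, zip_path):
    """Zip src_dir into zip_path, keeping src_dir's own name as the top-level folder."""
    src_dir = os.path.abspath(src_dir)
    parent = os.path.dirname(src_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED, allowZip64=True) as zf:
        for root, _, filenames in os.walk(src_dir):
            for name in filenames:
                full_path = os.path.join(root, name)
                # Store the path relative to the parent, e.g. test_dicoms/foo.txt
                zf.write(full_path, os.path.relpath(full_path, parent))
```

With the paths from the question, this would be called as zip_directory('/home/code/test_dicoms', '/home/code/test_dicoms.zip'). Like the example above, it only adds files, so empty directories are not recorded in the archive.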
    
