
Python 2.7.x zipfile: slow unzip from network drive (Windows)


I have a small script, written with the help of the Stack Overflow community, that unzips an archive.

The strange issue I'm currently facing is that large zip files (for example, 1 GB or more) appear to be downloaded(?) to the local computer in full before unpacking, and only after that does extraction start.

My script is:

#!/usr/bin/env python2.7
# coding=utf-8

import os
import sys


def unpack_zip(zip_file, to_dir):

    if sys.platform in ('darwin', 'linux2'):
        unpack = os.system('unzip %s -d %s' % (zip_file, to_dir))
        if unpack != 0:
            return False
        return to_dir

    elif 'win32' in sys.platform:
        import zipfile
        zf = zipfile.ZipFile(zip_file, "r")

        if zf.testzip() is not None:
            return False

        try:
            os.mkdir(to_dir)
        except OSError:
            pass

        def get_members(zip_archive):
            parts = []
            for name in zip_archive.namelist():
                if not name.endswith('/'):
                    parts.append(name.split('/')[:-1])
            prefix = os.path.commonprefix(parts) or ''
            if prefix:
                prefix = '/'.join(prefix) + '/'
            offset = len(prefix)
            for zipinfo in zip_archive.infolist():
                name = zipinfo.filename
                if len(name) > offset:
                    zipinfo.filename = name[offset:]
                    print "Extracting: %s" % name
                    yield zipinfo
            
        zf.extractall(to_dir, get_members(zf))
        zf.close()

        return to_dir

if __name__ == "__main__":
    archive = os.path.join(os.getcwd(), "zip_file.zip")
    unzip_to = os.path.join(os.getcwd(), "test_unzip")
    unpack_zip(archive, unzip_to)

If you run this script, it waits for a couple of minutes and only then begins extraction. Important note: the zip file must be located on a network drive.

My goal is to start the extraction process immediately (similar to the unzip tool on Linux / Mac). Is it possible to achieve this without third-party dependencies (using only ZipFile and Python)?


Solution

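  • You are testing your zipped files before unpacking. The docstring of the testzip method is clear: 'Read all the files and check the CRC.' Because every member has to be read to compute its CRC, the whole archive is pulled across the network before extraction begins. Delete the zf.testzip() check (the if statement and its return False), and unpacking should start immediately.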
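
As a minimal sketch of that suggestion (not the asker's full script: the function name unpack_zip_streaming is made up here, and the prefix-stripping get_members helper is left out for brevity), the Windows branch could look roughly like this once the testzip() pass is gone. Each member is read from the archive only when it is extracted, so the first files should appear without waiting for the whole archive to be read. As far as I know, zipfile still checks each member's CRC-32 while extracting it, so a corrupt entry should raise an error at that point rather than up front.

```python
import os
import zipfile


def unpack_zip_streaming(zip_file, to_dir):
    """Extract zip_file into to_dir without an up-front testzip() pass.

    Hypothetical helper for illustration; the name is not from the question.
    """
    if not os.path.isdir(to_dir):
        os.makedirs(to_dir)

    zf = zipfile.ZipFile(zip_file, "r")
    try:
        for info in zf.infolist():
            # Only this member's data is read from the (network) file here.
            zf.extract(info, to_dir)
    finally:
        zf.close()

    return to_dir
```

It would be called the same way as the original, for example unpack_zip_streaming(archive, unzip_to). If an explicit integrity check is still wanted, one option is to run testzip() only after extraction, where the delay no longer blocks the first files.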