
How do I write an if statement that checks a zip file for a subdirectory and, if none exists, creates a new directory?


As the title says, I am trying to write a branching if/elif that checks a zip file for a subdirectory, to determine whether I can extract the subdirectory and its files as they are or whether I need to create a new directory first. I have a lot of comic and image archives; some contain only images, and some contain one or more subdirectories. I want to automate the unarchiving process while keeping every image in its original directory structure. The code I have so far looks like this:

if name.endswith('.zip'):
    
    zip_ref = zipfile.ZipFile(archive, 'r')

    # OS path splits the name away from the file extension and path variable
    new_folder = os.path.splitext(archive)[0][0:]

    zip_con = [zip_ref.namelist()]

    for f in zip_con[0]:
       if f.endswith('/'):
           try:
               zip_ref.extractall()
           except(OSError, IOError) as err:
               if err.errno != errno.EEXIST:
                   raise
           continue
    os.remove(archive)

This works for archives that contain subdirectories: it extracts the subdirectories and images, then deletes the original archive. But if no entry passes the if check, it deletes the archive without extracting anything.

I attempted to add something like this:

else:
    os.mkdir(new_folder)
    zip_ref.extractall(new_folder)
    os.remove(archive)

and also:

elif not f.endswith('/'):
    os.mkdir(new_folder)
    zip_ref.extractall(new_folder)
    os.remove(archive)

But because this runs inside the for loop, the branch always triggers at least once, so a zip file with a subdirectory ends up extracted twice: one copy of just the subdirectory, and one copy of a new directory with another copy of the subdirectory and images nested inside it. In other cases it simply crashes as soon as it reaches the elif.
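To make the per-entry behaviour concrete, here is a minimal sketch that counts how many entries of an archive would hit each branch of the loop. The helper name `branch_counts` and the sample entry names are made up for illustration, and the archive is built in memory rather than read from disk.

```python
import io
import zipfile


def branch_counts(entry_names):
    """Build an in-memory zip from entry_names and count, per entry,
    how often the `endswith('/')` branch and the other branch would run."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, 'w') as zf:
        for name in entry_names:
            # A name ending in '/' is stored as a directory entry.
            zf.writestr(name, b'')

    dir_hits = 0
    file_hits = 0
    with zipfile.ZipFile(buffer) as zf:
        for f in zf.namelist():
            if f.endswith('/'):
                dir_hits += 1   # the "extract in place" branch
            else:
                file_hits += 1  # the "make a new folder" branch
    return dir_hits, file_hits


# Hypothetical archive with one subdirectory and two images.
nested = ['One Piece V06/', 'One Piece V06/001.jpg', 'One Piece V06/002.jpg']
print(branch_counts(nested))
```

Because namelist() returns every entry, files inside a subdirectory are tested individually, so an archive with a subdirectory still reaches the non-directory branch once per image.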

Edit: Directory and file information as well as clear example of current code results

For testing purposes I have all of the zips in one folder, together with the Python script. This is the current structure; note that the "All-rounder_Meguru" zips do not contain a subdirectory but the "One Piece" zips do.

Zip structure: (image)

Current results from my code, unarchived files: (image)

This duplicates the extraction: the fully named One Piece folders (e.g. One Piece v06 (2005) (Digital) (AnHeroGold-Empire)) also contain the "One Piece V06" subdirectory, with the images nested inside that.

Current code, with makedirs added as in the example linked in the comments:

        # Path to the archive
        archive = os.path.abspath(path)

        if name.endswith('.zip') or name.endswith('.cbz') or name.endswith('.ZIP') or name.endswith('.CBZ'):

            zip_ref = zipfile.ZipFile(archive, 'r')

            # OS path splits the name away from the file extension and path variable
            new_folder = os.path.splitext(archive)[0][0:]

            zip_con = [zip_ref.namelist()]

            for f in zip_con[0]:
                if f.endswith('/'):
                    try:
                        zip_ref.extractall()
                    except(OSError, IOError) as err:
                        if err.errno != errno.EEXIST:
                            raise
                    continue
                else:
                    try:
                        os.makedirs(new_folder)
                        zip_ref.extractall(new_folder)
                    except OSError as exc:
                        if exc.errno == errno.EEXIST and os.path.isdir(new_folder):
                            pass
            os.remove(archive)

Solution

  • After a long rest and coming back to the problem, I found that I simply did not understand how to use Python constructs like continue, break and StopIteration. Looking back, I probably asked the wrong question too.

    Adding 'break' here fixed all my issues, since extractall() already extracts the whole archive and the loop does not need to run again after it succeeds:

    try:
       zip_ref.extractall()
       break
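For comparison, one way to avoid the loop entirely is to decide once per archive rather than once per entry. The following is only a sketch under a few assumptions: the function name `extract_archive` is hypothetical, an archive counts as "having a subdirectory" if any entry name contains a '/', and extraction happens next to the archive instead of in the current working directory.

```python
import os
import zipfile


def extract_archive(archive):
    """Extract `archive` next to itself and return the directory used.

    Assumption: archives with a subdirectory are extracted as they are;
    flat archives go into a new folder named after the archive.
    """
    archive = os.path.abspath(archive)
    parent = os.path.dirname(archive)

    with zipfile.ZipFile(archive, 'r') as zip_ref:
        # One decision for the whole archive. Checking for '/' anywhere in
        # a name also covers zips without explicit directory entries.
        has_subdir = any('/' in name for name in zip_ref.namelist())

        if has_subdir:
            target = parent
        else:
            target = os.path.splitext(archive)[0]
            # exist_ok=True replaces the manual errno.EEXIST check.
            os.makedirs(target, exist_ok=True)

        zip_ref.extractall(target)

    return target
```

A caller could run `os.remove(archive)` after `extract_archive(archive)` returns, so the archive is only deleted once extraction has finished without raising. Archives that mix top-level images with a subdirectory are treated as having a subdirectory here, which may or may not be the desired behaviour.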