zip, google-colaboratory, unzip

Unzipping image directory in Google Colab doesn't unzip entire contents


I'm trying to unzip a zipped directory of 75,000 images to train a CNN. When I unzip it with

!unzip -uq "/content/BDD_gt_8045.zip" -d "/content/drive/My Drive/Ground_Truth"

not all of the images are extracted. I believe I end up with only about 5,000. I tried running the command several times, but then I got some duplicates. Is there a limit on the number of files I can unzip?

I'm currently stuck on how else to get all of the files into my Drive so I can train the model.
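
To replace the "about 5,000" estimate with an exact figure, one option is to compare the archive's own file list against what is actually on disk. The following is only a rough sketch using Python's standard `zipfile` module; the helper name `missing_entries` is made up for illustration, and the paths are the ones from the command above.

```python
import os
import zipfile


def missing_entries(zip_path, dest_dir):
    """Return the archive's file entries that do not exist under dest_dir."""
    with zipfile.ZipFile(zip_path) as archive:
        # Skip directory entries; only real files are counted.
        names = [info.filename for info in archive.infolist() if not info.is_dir()]
    return [name for name in names if not os.path.isfile(os.path.join(dest_dir, name))]


if __name__ == "__main__":
    # Paths taken from the question; nothing runs if the archive is absent.
    zip_path = "/content/BDD_gt_8045.zip"
    dest_dir = "/content/drive/My Drive/Ground_Truth"
    if os.path.exists(zip_path):
        missing = missing_entries(zip_path, dest_dir)
        print(f"{len(missing)} archive entries are missing from {dest_dir}")
```

An empty list would mean every file in the archive is already present in the destination folder.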


Solution

  • Colab's default 'unzip' binary doesn't seem to work as expected here. It appears to stop extracting partway through the archive. Run the latest version of 7z instead and you should be good to go.

    # To extract with full paths
    !7z x <filename.zip>
    
    # To extract all the files in the same folder (ignore paths)
    !7z e <filename.zip>
    
    # To specify an output directory, use '-o' with no space between the switch and the path
    !7z x <filename.zip> -o'/content/drive/My Drive/Datasets/FashionMNIST'
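
If you would rather not depend on either command-line tool, Python's standard `zipfile` module can do the extraction from a notebook cell and report how many files actually landed on disk. This is a minimal sketch, not a tested fix for the problem in the question: the function name `extract_and_count` and the destination `/content/Ground_Truth` are hypothetical, and extracting to local Colab storage rather than straight into the mounted Drive folder is my own assumption.

```python
import os
import zipfile


def extract_and_count(zip_path, dest_dir):
    """Extract every member of zip_path into dest_dir.

    Returns (expected, found): the number of file entries in the archive
    and how many of them exist on disk after extraction.
    """
    os.makedirs(dest_dir, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        names = [info.filename for info in archive.infolist() if not info.is_dir()]
    found = sum(os.path.isfile(os.path.join(dest_dir, name)) for name in names)
    return len(names), found


if __name__ == "__main__":
    zip_path = "/content/BDD_gt_8045.zip"  # archive path from the question
    dest_dir = "/content/Ground_Truth"     # hypothetical local destination
    if os.path.exists(zip_path):
        expected, found = extract_and_count(zip_path, dest_dir)
        print(f"extracted {found} of {expected} files")
```

If the two numbers differ, the extraction was incomplete and is worth re-running before training.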
    
