
C# and Zip file manipulation


Here is what I am looking for:

I need to open a zip file of images and iterate through its contents. The zip file contains subdirectories, and one of them, "IDX", holds the images I need. Extracting the zip file contents to a directory is not a problem in itself, but my zip files can be huge (multiple GB), so I am hoping to open the file and pull out the images one at a time as I iterate through them, processing each as I go.

After I am done I just close the zip file. These images are actually being housed in a database.

Does anyone have any idea how to do this with, hopefully, free tools or built-in APIs? This process will run on a Windows machine.

Thanks!


Solution

  • SharpZipLib is a great tool for your requirements.

    I have used it to process giant files within directories within giant nested zip files (meaning ZIP files within ZIP files), using streams. I was able to open a zip stream on top of a zip stream so that I could investigate the contents of the inner zip without having to extract the entire parent. You can then use a stream to peek at the content files, which may help you determine whether you want to extract it or not. It's open-source.

    EDIT: Directory handling in the library is not ideal. As I recall, it contains separate entries for some directories, while others are implied by the paths of the file entries.

    Here's an extract of the code I used to collect the actual file and folder names at a certain level (_startPath). Let me know if you're interested in the whole wrapper class.

    // Requires: using System.Collections.Generic; using ICSharpCode.SharpZipLib.Zip;
    // _zipFile = your ZipFile instance
    List<string> _folderNames = new List<string>();
    List<string> _fileNames = new List<string>();
    string _startPath = "";
    const string PATH_SEPARATOR = "/";
    
    foreach ( ZipEntry entry in _zipFile )
    {
        string name = entry.Name;
    
        if ( _startPath != "" )
        {
            if ( name.StartsWith( _startPath + PATH_SEPARATOR ) )
                name = name.Substring( _startPath.Length + 1 );
            else
                continue;
        }
    
        // Ignore items nested deeper than one level below this folder
        // (the remaining name contains more than one separator)
        if ( name.IndexOf( PATH_SEPARATOR ) != name.LastIndexOf( PATH_SEPARATOR ) )
            continue;
    
        string thisPath = null;
        string thisFile = null;
    
        if ( entry.IsDirectory )
        {
            thisPath = name.TrimEnd( PATH_SEPARATOR.ToCharArray() );
        }
        else if ( entry.IsFile )
        {
            if ( name.Contains( PATH_SEPARATOR ) )
                thisPath = name.Substring( 0, name.IndexOf( PATH_SEPARATOR ) );
            else
                thisFile = name;
        }
    
        if ( !string.IsNullOrEmpty( thisPath ) && !_folderNames.Contains( thisPath ) )
            _folderNames.Add( thisPath );
    
        if ( !string.IsNullOrEmpty( thisFile ) && !_fileNames.Contains( thisFile ) )
            _fileNames.Add( thisFile );
    }
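
For the original question (pulling images out of one folder, one at a time, without extracting the whole archive), the same library can be used roughly as follows. This is only a minimal sketch, assuming the SharpZipLib package is referenced and that "IDX" sits at the top level of the archive; the class name `ZipImageReader`, the helper `IsUnderFolder`, and the `process` callback are hypothetical names introduced for illustration, not part of the library.

```csharp
using System;
using System.IO;
using ICSharpCode.SharpZipLib.Zip;

public static class ZipImageReader
{
    // Hypothetical helper: true when entryName lies inside the given folder
    // (directly or in a subfolder). "IDX2/a.png" does not match "IDX".
    public static bool IsUnderFolder(string entryName, string folder)
    {
        string prefix = folder.TrimEnd('/') + "/";
        return entryName.StartsWith(prefix, StringComparison.OrdinalIgnoreCase);
    }

    // Opens the archive once and hands each matching file to the callback
    // as a stream, so only one entry is being decompressed at a time.
    public static void ProcessImages(string zipPath, string folder,
                                     Action<string, Stream> process)
    {
        using (ZipFile zipFile = new ZipFile(zipPath))
        {
            foreach (ZipEntry entry in zipFile)
            {
                // Skip directory entries and anything outside the folder.
                if (!entry.IsFile || !IsUnderFolder(entry.Name, folder))
                    continue;

                using (Stream stream = zipFile.GetInputStream(entry))
                {
                    process(entry.Name, stream);
                }
            }
        }
    }
}
```

A call might look like `ZipImageReader.ProcessImages(zipPath, "IDX", (name, stream) => { /* read the image and store it */ });`, where `zipPath` is whatever archive you are processing. Because the match is on the entry path prefix, it works whether or not the archive contains an explicit directory entry for "IDX".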