c# zip .net-4.5 sharpziplib

C#/.NET: identify whether a file is a zip file


I am currently using the SharpZipLib API to handle my zip file entries. It works splendidly for zipping and unzipping. However, I am having trouble identifying whether a file is a zip file or not. I need to know if there is a way to detect whether a file stream can be decompressed. Originally I used:

FileStream lFileStreamIn = File.OpenRead(mSourceFile);
lZipFile = new ZipFile(lFileStreamIn);
ZipInputStream lZipStreamTester = new ZipInputStream(lFileStreamIn, mBufferSize);// not working
lZipStreamTester.Read(lBuffer, 0, 0);
if (lZipStreamTester.CanDecompressEntry)
{

The lZipStreamTester becomes null every time and the if statement fails. I tried it with and without a buffer. Can anybody give any insight as to why? I am aware that I can check the file extension, but I need something more definitive than that. I am also aware that zip has a magic number (PK something), but it isn't guaranteed to always be there because it isn't a requirement of the format.

Also, I read that .NET 4.5 has native zip support, so my project may migrate to that instead of SharpZipLib, but I still didn't see a method or parameter similar to CanDecompressEntry here: http://msdn.microsoft.com/en-us/library/3z72378a%28v=vs.110%29

My last resort will be to use a try/catch and attempt to unzip the file.
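For reference, that last-resort approach could look roughly like the sketch below. It assumes the built-in System.IO.Compression.ZipArchive class from .NET 4.5 (not SharpZipLib) and a seekable stream; ZipProbe and LooksLikeZip are hypothetical names used only for illustration.

```csharp
using System.IO;
using System.IO.Compression;

static class ZipProbe
{
    // Hypothetical helper: returns true if the stream can be opened as a zip
    // archive. Assumes the stream is seekable so its position can be restored.
    public static bool LooksLikeZip(Stream stream)
    {
        long start = stream.Position;
        try
        {
            // leaveOpen: true keeps the caller's stream usable afterwards.
            using (var archive = new ZipArchive(stream, ZipArchiveMode.Read, leaveOpen: true))
            {
                // Touching Entries forces the central directory to be read.
                return archive.Entries.Count >= 0;
            }
        }
        catch (InvalidDataException)
        {
            // Thrown when the stream contents are not in zip archive format.
            return false;
        }
        finally
        {
            stream.Position = start;
        }
    }
}
```

This only shows that the archive structure is readable; an individual entry could still fail to decompress when it is actually extracted.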


Solution

  • This is a base class for a component that needs to handle data that is either uncompressed, PKZIP compressed (SharpZipLib) or GZip compressed (built into .NET). It is perhaps a bit more than you need, but it should get you going. It is an example of using @PhonicUK's suggestion to parse the header of the data stream. The derived classes you see in the little factory method handle the specifics of PKZip and GZip decompression.

    using System;
    using System.Diagnostics;
    using System.IO;

    abstract class Expander
    {
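        // "PK\x03\x04" (ZIP local file header signature) read as a little-endian Int32
        private const int ZIP_LEAD_BYTES = 0x04034b50;
        // 0x1f 0x8b (GZIP magic bytes) read as a little-endian UInt16
        private const ushort GZIP_LEAD_BYTES = 0x8b1f;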
    
        public abstract MemoryStream Expand(Stream stream); 
        
        internal static bool IsPkZipCompressedData(byte[] data)
        {
            Debug.Assert(data != null && data.Length >= 4);
            // if the first 4 bytes of the array are the ZIP signature then it is compressed data
            return (BitConverter.ToInt32(data, 0) == ZIP_LEAD_BYTES);
        }
    
        internal static bool IsGZipCompressedData(byte[] data)
        {
            Debug.Assert(data != null && data.Length >= 2);
            // if the first 2 bytes of the array are the GZIP signature then it is compressed data
            return (BitConverter.ToUInt16(data, 0) == GZIP_LEAD_BYTES);
        }
    
        public static bool IsCompressedData(byte[] data)
        {
            return IsPkZipCompressedData(data) || IsGZipCompressedData(data);
        }
    
        public static Expander GetExpander(Stream stream)
        {
            Debug.Assert(stream != null);
            Debug.Assert(stream.CanSeek);
            stream.Seek(0, SeekOrigin.Begin);
    
            try
            {
                byte[] bytes = new byte[4];
    
                stream.Read(bytes, 0, 4);
    
                if (IsGZipCompressedData(bytes))
                    return new GZipExpander();
    
                if (IsPkZipCompressedData(bytes))
                    return new ZipExpander();
    
                return new NullExpander();
            }
            finally
            {
                stream.Seek(0, SeekOrigin.Begin);  // set the stream back to the beginning
            }
        }
    }
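
The header check above relies on BitConverter, whose result depends on the machine's byte order, and on a single Stream.Read call, which is allowed to return fewer bytes than requested. A minimal sketch of a byte-by-byte variant follows; it assumes a seekable stream, and SignatureSniffer and its members are hypothetical names, not part of SharpZipLib or .NET.

```csharp
using System.IO;

static class SignatureSniffer
{
    // Stream.Read may return fewer bytes than requested, so loop until the
    // buffer is full or the stream ends. Returns the number of bytes read.
    private static int ReadFully(Stream stream, byte[] buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int n = stream.Read(buffer, total, buffer.Length - total);
            if (n == 0) break; // end of stream
            total += n;
        }
        return total;
    }

    // Compares the bytes at the current position with the given signature,
    // then restores the position. Assumes stream.CanSeek is true.
    private static bool StartsWith(Stream stream, params byte[] signature)
    {
        long start = stream.Position;
        try
        {
            var header = new byte[signature.Length];
            if (ReadFully(stream, header) < signature.Length) return false;
            for (int i = 0; i < signature.Length; i++)
            {
                if (header[i] != signature[i]) return false;
            }
            return true;
        }
        finally
        {
            stream.Position = start;
        }
    }

    // "PK\x03\x04": the ZIP local file header signature.
    public static bool StartsWithZipSignature(Stream stream)
    {
        return StartsWith(stream, 0x50, 0x4B, 0x03, 0x04);
    }

    // 0x1f 0x8b: the GZIP magic bytes.
    public static bool StartsWithGZipSignature(Stream stream)
    {
        return StartsWith(stream, 0x1F, 0x8B);
    }
}
```

Like the class above, this only recognizes data that begins with a local file header. An empty archive, or one with other data prepended such as a self-extracting executable, would not match, so a signature check is a quick filter rather than proof that the data can be decompressed.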