c# compression out-of-memory sharpziplib

Compress large file using SharpZipLib causing Out Of Memory Exception


I have a 453MB XML file which I'm trying to compress to a ZIP using SharpZipLib.

Below is the code I'm using to create the zip, but it throws an OutOfMemoryException. The same code successfully compresses a 428MB file.

Any idea why the exception is happening? I can't see the cause, since my system has plenty of memory available.

public void CompressFiles(List<string> pathnames, string zipPathname)
{
    try
    {
        using (FileStream stream = new FileStream(zipPathname, FileMode.Create, FileAccess.Write, FileShare.None))
        {
            using (ZipOutputStream stream2 = new ZipOutputStream(stream))
            {
                foreach (string str in pathnames)
                {
                    FileStream stream3 = new FileStream(str, FileMode.Open, FileAccess.Read, FileShare.Read);
                    byte[] buffer = new byte[stream3.Length];
                    try
                    {
                        if (stream3.Read(buffer, 0, buffer.Length) != buffer.Length)
                        {
                            throw new Exception(string.Format("Error reading '{0}'.", str));
                        }
                    }
                    finally
                    {
                        stream3.Close();
                    }
                    ZipEntry entry = new ZipEntry(Path.GetFileName(str));
                    stream2.PutNextEntry(entry);
                    stream2.Write(buffer, 0, buffer.Length);
                }
                stream2.Finish();
            }
        }
    }
    catch (Exception)
    {
        File.Delete(zipPathname);
        throw;
    }
}

Solution

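  • You're trying to create a buffer as big as the file (new byte[stream3.Length]), so the whole file has to fit in a single array in memory. Instead, make the buffer a fixed size, read some bytes into it, and write only the bytes that were actually read to the zip stream, repeating until the file is exhausted.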

    Here's your code with a buffer of 4096 bytes (and some cleanup):

    public static void CompressFiles(List<string> pathnames, string zipPathname)
    {
        const int BufferSize = 4096;
        byte[] buffer = new byte[BufferSize];
    
        try
        {
            using (FileStream stream = new FileStream(zipPathname,
                FileMode.Create, FileAccess.Write, FileShare.None))
            using (ZipOutputStream stream2 = new ZipOutputStream(stream))
            {
                foreach (string str in pathnames)
                {
                    using (FileStream stream3 = new FileStream(str,
                        FileMode.Open, FileAccess.Read, FileShare.Read))
                    {
                        ZipEntry entry = new ZipEntry(Path.GetFileName(str));
                        stream2.PutNextEntry(entry);
    
                        int read;
                        while ((read = stream3.Read(buffer, 0, buffer.Length)) > 0)
                        {
                            stream2.Write(buffer, 0, read);
                        }
                    }
                }
                stream2.Finish();
            }
        }
        catch (Exception)
        {
            File.Delete(zipPathname);
            throw;
        }
    }
    

    Especially note this block:

    const int BufferSize = 4096;
    byte[] buffer = new byte[BufferSize];
    // ...
    int read;
    while ((read = stream3.Read(buffer, 0, buffer.Length)) > 0)
    {
        stream2.Write(buffer, 0, read);
    }
    

    This reads bytes into buffer. When there are no more bytes, Read() returns 0, and that's when we stop. When Read() returns a positive value, we can be sure there is some data in the buffer, but we don't know in advance how many bytes: the whole buffer might be filled, or just a small portion of it. Therefore, we use the number of bytes read to determine how many bytes to write to the ZipOutputStream.
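
    To see the loop at work without SharpZipLib, here is a minimal sketch that runs the same pattern between two in-memory streams. The class name ChunkedCopyDemo and the helper CopyInChunks are made up for this illustration; only Stream.Read and Stream.Write come from the framework.

```csharp
using System;
using System.IO;

public static class ChunkedCopyDemo
{
    // Hypothetical helper: copies source to destination through a
    // fixed-size buffer and returns the total number of bytes copied.
    public static long CopyInChunks(Stream source, Stream destination, int bufferSize)
    {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;

        // Read() returns 0 only at the end of the stream.
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Write only the bytes that were actually read, not the whole buffer.
            destination.Write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    public static void Main()
    {
        // 10000 bytes is deliberately not a multiple of 4096,
        // so the last Read() fills only part of the buffer.
        byte[] payload = new byte[10000];
        new Random(42).NextBytes(payload);

        using (var source = new MemoryStream(payload))
        using (var destination = new MemoryStream())
        {
            long copied = CopyInChunks(source, destination, 4096);
            Console.WriteLine(copied);                                // 10000
            Console.WriteLine(destination.Length == payload.Length);  // True
        }
    }
}
```

    Memory use stays at the size of the buffer regardless of how large the source is, which is the point of the fix.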

    That block of code, by the way, can be replaced by a single call to Stream.CopyTo, which was added in .NET Framework 4.0 and performs the same kind of buffered copy internally:

    stream3.CopyTo(stream2);
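
    If you would rather choose the buffer size yourself, Stream.CopyTo also has an overload that takes it as a second argument. The sketch below uses in-memory streams as stand-ins for the file and zip streams; the class name CopyToDemo and the 64 KB figure are arbitrary choices for illustration, not a recommendation.

```csharp
using System;
using System.IO;

public static class CopyToDemo
{
    public static void Main()
    {
        // Stand-in for a file's contents.
        byte[] payload = new byte[1000000];
        for (int i = 0; i < payload.Length; i++)
        {
            payload[i] = (byte)(i % 251);
        }

        using (var source = new MemoryStream(payload))
        using (var destination = new MemoryStream())
        {
            // Stream.CopyTo(Stream, int): the second argument is the
            // size in bytes of the intermediate copy buffer.
            source.CopyTo(destination, 64 * 1024);

            Console.WriteLine(destination.Length);  // 1000000
        }
    }
}
```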
    

    So, your code could become:

    public static void CompressFiles(List<string> pathnames, string zipPathname)
    {
        try
        {
            using (FileStream stream = new FileStream(zipPathname,
                FileMode.Create, FileAccess.Write, FileShare.None))
            using (ZipOutputStream stream2 = new ZipOutputStream(stream))
            {
                foreach (string str in pathnames)
                {
                    using (FileStream stream3 = new FileStream(str,
                        FileMode.Open, FileAccess.Read, FileShare.Read))
                    {
                        ZipEntry entry = new ZipEntry(Path.GetFileName(str));
                        stream2.PutNextEntry(entry);
    
                        stream3.CopyTo(stream2);
                    }
                }
                stream2.Finish();
            }
        }
        catch (Exception)
        {
            File.Delete(zipPathname);
            throw;
        }
    }
    

    And now you know why you got the error, and how to use buffers.