c# gzip compression gzipstream

Concatenate gzipped byte arrays in C#


I have gzipped data stored in a database. Is there a way to concatenate, say, 50 separately gzipped items into one gzipped output that can still be decompressed? The result should be the same as decompressing those 50 items, concatenating them, and then gzipping the result.

I would like to avoid the decompression phase. Is there also a performance benefit to merging already-gzipped data instead of gzipping the whole byte array?


Solution

  • Yes, you can concatenate gzip streams. Decompressing the concatenation gives you exactly what you would get by concatenating the uncompressed data, gzipping it all at once, and decompressing that. Specifically:

    gzip a
    gzip b
    cat a.gz b.gz > c.gz
    gunzip c.gz
    

    will give you the same c as:

    cat a b > c
    

    However, compression will be worse than gzipping the whole thing at once, especially if each of your 50 pieces is small, e.g. less than a few tens of kilobytes. Each piece is compressed without any knowledge of the others and carries its own gzip header and trailer. The compressed result will always be different, and a little or a lot larger depending on the size of the pieces.

    The comment in another answer about GZipStream should be heeded. I also recommend using DotNetZip instead.
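
Both claims above (the round trip works, but the result is larger) can be checked with a small script. This is only a sketch under stated assumptions: the 50 sample records, the file names `piece1` to `piece50`, `plain`, `joined.gz` and `whole.gz` are all made up for illustration, and it assumes `gzip`, `cmp` and `mktemp` are on the PATH.

```shell
#!/bin/sh
# Sketch: concatenated gzip members decompress to the concatenated input,
# but take more space than gzipping everything in one go.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Hypothetical sample data: 50 small, similar pieces (stand-ins for DB rows).
i=1
while [ "$i" -le 50 ]; do
    printf 'record %d: the quick brown fox jumps over the lazy dog\n' "$i" > "piece$i"
    i=$((i + 1))
done

# Build both variants in the same order.
i=1
while [ "$i" -le 50 ]; do
    cat "piece$i" >> plain              # uncompressed concatenation
    gzip -c < "piece$i" >> joined.gz    # append one gzip member per piece
    i=$((i + 1))
done

gzip -c < plain > whole.gz              # compress everything at once

# Round trip: the multi-member file decompresses to the plain concatenation.
gzip -dc joined.gz | cmp - plain && echo "contents match"

# Size comparison: each member has its own header and trailer and cannot
# reuse redundancy found in the other pieces.
joined_size=$(($(wc -c < joined.gz)))
whole_size=$(($(wc -c < whole.gz)))
echo "joined.gz: $joined_size bytes, whole.gz: $whole_size bytes"
```

With pieces this small and this similar, `joined.gz` should come out several times larger than `whole.gz`; the gap narrows as the individual pieces grow. In C#, the merge step itself needs no gzip API at all: appending the stored byte arrays in order is enough, and only the reader has to cope with multiple members.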