c#, 7zip, sevenzipsharp, text-compression

How to compress / decompress a string using SevenZip (7-Zip)


The documentation is so poor that I am struggling to get this running.

I added the DLL files and the proper references.

It now compresses the string, but I get an error when I decompress it.

Can you tell me where the error is?

    public static string SevenZip_CompressString(string text)
    {
        byte[] compressedData = null;

        SevenZipCompressor compressor = new SevenZipCompressor();
        compressor.CompressionMethod = CompressionMethod.Ppmd;
        compressor.CompressionLevel = SevenZip.CompressionLevel.Ultra;
        compressor.ScanOnlyWritable = true;
        compressor.DefaultItemName = "T";

        using (MemoryStream msin = new MemoryStream(Encoding.UTF8.GetBytes(text)))
        {
            using (MemoryStream msout = new MemoryStream())
            {
                compressor.CompressStream(msin, msout);

                compressedData = msout.ToArray();
            }
        }

        return System.Text.Encoding.UTF8.GetString(compressedData);
    }

Here is the decompression method:

    public static string SevenZip_DE_CompressString(string compressedText)
    {
        byte[] uncompressedbuffer = null;

        using (MemoryStream compressedbuffer = new MemoryStream(Encoding.UTF8.GetBytes(compressedText)))
        {
            using (SevenZipExtractor extractor = new SevenZipExtractor(compressedbuffer))
            {
                using (MemoryStream msout = new MemoryStream())
                {
                    extractor.ExtractFile(0, msout);
                    uncompressedbuffer = msout.ToArray();
                }
            }
        }

        return Encoding.UTF8.GetString(uncompressedbuffer);
    }

Here is the error message I get:

C#, .NET 4.5, WPF

packages\SevenZipSharp.0.64\lib\SevenZipSharp.dll

[screenshot of the error message]


Solution

  • These are wrong:

    System.Text.Encoding.UTF8.GetString(compressedData)
    Encoding.UTF8.GetBytes(compressedText)
    

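    Compressed data isn't UTF-8, and you shouldn't try to treat it as text: decoding arbitrary bytes as UTF-8 replaces every invalid byte sequence, so the original bytes cannot be recovered afterwards. Always store compressed data in binary form, as a byte[]. If you need to pass it through a text-only channel, such as e-mail, use Base64 encoding.

    Fundamentally though, change your thinking. Compression is not a function string -> string. It's byte[] -> byte[]. It's also valid to consider compression as string -> byte[] (and decompression as byte[] -> string), as long as the compressed side stays binary.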
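To make the point concrete, here is a minimal sketch that uses only the .NET base class library, not SevenZipSharp. The class name `BinaryAsTextDemo`, its two methods, and the `Sample` bytes are made up for illustration; `Sample` merely stands in for compressor output. Its first six bytes are the 7z file signature, which already contains bytes that are not valid UTF-8.

```csharp
using System;
using System.Linq;
using System.Text;

public static class BinaryAsTextDemo
{
    // Hypothetical stand-in for compressed output. The first six bytes are the
    // 7z signature ('7', 'z', 0xBC, 0xAF, 0x27, 0x1C); the rest is arbitrary.
    public static readonly byte[] Sample =
        { 0x37, 0x7A, 0xBC, 0xAF, 0x27, 0x1C, 0x00, 0xFF };

    // What the question's code does: decode binary data as UTF-8 text and
    // encode it again. Invalid sequences become U+FFFD, so bytes are lost.
    public static byte[] RoundTripViaUtf8(byte[] data)
    {
        string asText = Encoding.UTF8.GetString(data);
        return Encoding.UTF8.GetBytes(asText);
    }

    // The text-safe alternative: Base64 maps every byte sequence to ASCII
    // text and back without loss.
    public static byte[] RoundTripViaBase64(byte[] data)
    {
        string asText = Convert.ToBase64String(data);
        return Convert.FromBase64String(asText);
    }

    // True when the bytes come back unchanged after the given round trip.
    public static bool Survives(Func<byte[], byte[]> roundTrip)
    {
        return Sample.SequenceEqual(roundTrip(Sample));
    }
}
```

`BinaryAsTextDemo.Survives(BinaryAsTextDemo.RoundTripViaUtf8)` returns `false`, while `BinaryAsTextDemo.Survives(BinaryAsTextDemo.RoundTripViaBase64)` returns `true`. Applied to the question's methods, that would mean returning `msout.ToArray()` from the compress method as a `byte[]`, passing that `byte[]` straight into the `MemoryStream` given to `SevenZipExtractor`, and calling `Convert.ToBase64String` / `Convert.FromBase64String` only if the archive really has to travel as a string.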