Kotlin Gzip String not working as expected


The following is my code to compress the string:

 val bos = ByteArrayOutputStream()
 GZIPOutputStream(bos).bufferedWriter(Charsets.UTF_8).use { it.write(data) }
 String(bos.toByteArray(),Charsets.UTF_8)

For input string <?xml version="1.0" encoding="UTF-8" standalone="no"?><VAST version="2.0"> I am getting this as output: �??????????????�����Q(K-*��ϳU2�3PRH�K�O��K�U

I am expecting the output below, based on this site: https://www.zickty.com/texttogzip

H4sIAAAAAAAAA7Oxr8jNUShLLSrOzM+zVTLUM1BSSM1Lzk/JzEu3VQoNcdO1UFIoLknMS0nMyc9LtVXKy1eyt7MJcwwOQegyAuqyAwB+wzaKSgAAAA==

How can I gzip a string and get the result in a printable-character format like the one above?


Solution

  • H4sIAAAAAAAAA7Oxr8jNUShLLSrOzM+zVTLUM1BSSM1Lzk/JzEu3VQoNcdO1UFIoLknMS0nMyc9LtVXKy1eyt7MJcwwOQegyAuqyAwB+wzaKSgAAAA==

    This string is Base64 text: the data was gzipped first, and the compressed bytes were then Base64-encoded. (Note that Base64 output is roughly one third larger than the raw bytes it encodes, which can cost you more than the gzipping saved.)

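    Your code produces the raw gzip bytes, but then decodes them as a UTF_8 string, and that is what corrupted the output: compressed data is arbitrary binary, and byte sequences that are not valid UTF-8 get replaced during decoding and cannot be recovered. Either do not convert the bytes to a string at all, or use Base64 to turn them into a string, for example with java.util.Base64.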

    Note also that different compression settings, even for the same gzip algorithm, can produce different compressed output, so your result may not match the site's string byte for byte. The only reliable way to validate a compression result is to decompress it and compare it with the original input.
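
As a minimal sketch of the advice above: the compression line is taken from the question, and only the final conversion changes from a UTF_8 decode to Base64. The helper names `gzipToBase64` and `gunzipFromBase64` are hypothetical, introduced here for illustration. The sketch assumes `java.util.Base64` is available on your target platform; if it is not, another Base64 implementation would take its place.

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.util.Base64
import java.util.zip.GZIPInputStream
import java.util.zip.GZIPOutputStream

// Hypothetical helper: gzip a string and return the compressed bytes as Base64 text.
fun gzipToBase64(data: String): String {
    val bos = ByteArrayOutputStream()
    // use {} closes the writer and the GZIP stream, which flushes the gzip trailer.
    GZIPOutputStream(bos).bufferedWriter(Charsets.UTF_8).use { it.write(data) }
    // Encode the raw bytes as Base64 instead of decoding them as UTF-8.
    return Base64.getEncoder().encodeToString(bos.toByteArray())
}

// Hypothetical helper: reverse the steps above to recover the original string.
fun gunzipFromBase64(encoded: String): String {
    val bytes = Base64.getDecoder().decode(encoded)
    return GZIPInputStream(ByteArrayInputStream(bytes))
        .bufferedReader(Charsets.UTF_8)
        .use { it.readText() }
}

fun main() {
    val input = """<?xml version="1.0" encoding="UTF-8" standalone="no"?><VAST version="2.0">"""
    val encoded = gzipToBase64(input)
    println(encoded)
    // Validate by decompressing, not by comparing against another tool's output.
    check(gunzipFromBase64(encoded) == input)
}
```

The printed string may differ from the one the website produces, so the round-trip check in `main` is the comparison to rely on.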