I am currently reading about the deflate algorithm and as part of learning I picked one file that I zipped using different methods. What I found and what confuses me very much is that the different methods produced different bytes representing the compressed file.
I tried zipping the file using WinRar, 7-Zip, using the Java zlib library(ZipOutputStream
class) and also manually by just doing the deflate upon the source data(Deflater
class). All of the four methods produced completely different bytes.
My goal was just to see that all of the methods produced the same byte array as a result, but this was not the case and my question is why could that be? I made sure by checking the file headers that all of this software actually used the deflate algorithm.
Can anyone help with this? Is it possible that deflate algorithm can produce different compressed result for exactly the same source file?
There are many, many deflate representations of the same data. Surely you have already noticed that you can set a compression level. That could only have an effect if there were different ways to compress the same data. What you get depends on the compression level, any other compression settings, the software you are using, and the version of that software.
The only guarantee is that when you compress and then decompress, you get exactly what you started with. There is no guarantee, nor does there need to be or should be such a guarantee, that you get the same thing when you decompress and then compress.
Why do you have that goal?