This code uses zlib to encode some data, but with level=0, so the data is not actually compressed:
import zlib
print('zlib.ZLIB_VERSION', zlib.ZLIB_VERSION)
total = 0
print('Total 1', total)
compress_obj = zlib.compressobj(level=0, memLevel=9, wbits=-zlib.MAX_WBITS)
total += len(compress_obj.compress(b'-' * 1000000))
print('Total 2', total)
total += len(compress_obj.flush())
print('Total 3', total)
Python 3.9.12 outputs
zlib.ZLIB_VERSION 1.2.12
Total 1 0
Total 2 983068
Total 3 1000080
but Python 3.10.6 (and Python 3.11.0) outputs
zlib.ZLIB_VERSION 1.2.13
Total 1 0
Total 2 1000080
Total 3 1000085
so both a different final size, and a different size along the way.
Why? And how can I get them to be identical? (I'm writing a library and would prefer identical behaviour across Python versions.)
zlib 1.2.12 and 1.2.13 behave identically in this regard. The Python library must be making different deflate() calls with different amounts of data, and possibly introducing a flush in the later version. You can look in the Python source code to find out.
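As a rough sanity check on the numbers in the question, the totals are consistent with raw-deflate stored blocks, each holding at most 65535 bytes and costing 5 bytes of header when byte-aligned. The sketch below is only arithmetic under that assumption; the names are made up for illustration, and it does not show what the Python library actually does internally.

```python
import math

# Assumptions about the deflate stored-block format (not taken from the
# question): at most 65535 data bytes per block, 5 header bytes per block.
STORED_BLOCK_MAX = 65535
STORED_BLOCK_HEADER = 5

def min_stored_size(n_bytes):
    """Smallest possible level-0 output for n_bytes of input."""
    blocks = math.ceil(n_bytes / STORED_BLOCK_MAX)
    return n_bytes + blocks * STORED_BLOCK_HEADER

minimum = min_stored_size(1000000)
print(minimum)            # 1000080, the final total reported for Python 3.9.12
print(1000085 - minimum)  # 5, i.e. one extra block header in the 3.10.6 total
```

The extra 5 bytes in the later version would fit one additional (possibly empty) stored block, which is what an added flush would produce.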
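You should be able to force identical output if you feed smaller amounts of data to .compress() each time, e.g. less than 64K-1 bytes, and use .flush() after each. Note that in Python .flush() defaults to Z_FINISH, which ends the stream, so the intermediate calls need a non-finishing mode such as zlib.Z_SYNC_FLUSH. The output will be larger, but should be identical across versions.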
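A minimal sketch of that approach, under some assumptions: the helper name and the 65000-byte chunk size are arbitrary choices for illustration, and zlib.Z_SYNC_FLUSH is used for the intermediate flushes (another non-finishing mode may work as well). It has not been checked against every Python version.

```python
import zlib

# Assumption: any chunk size below 65535 bytes should do; 65000 is arbitrary.
CHUNK_SIZE = 65000

def deflate_stored_chunked(data, chunk_size=CHUNK_SIZE):
    """Encode data at level 0, flushing after every small chunk."""
    compress_obj = zlib.compressobj(level=0, memLevel=9, wbits=-zlib.MAX_WBITS)
    pieces = []
    for start in range(0, len(data), chunk_size):
        pieces.append(compress_obj.compress(data[start:start + chunk_size]))
        # Force out everything buffered so far without ending the stream.
        pieces.append(compress_obj.flush(zlib.Z_SYNC_FLUSH))
    # The final flush uses the default mode (Z_FINISH) to terminate the stream.
    pieces.append(compress_obj.flush())
    return b''.join(pieces)

payload = b'-' * 1000000
encoded = deflate_stored_chunked(payload)
print('Encoded size', len(encoded))

# Round-trip check: the raw deflate stream decodes back to the input.
roundtrip = zlib.decompress(encoded, wbits=-zlib.MAX_WBITS)
print('Round trip OK', roundtrip == payload)
```

Running this under each Python version of interest and comparing the encoded bytes (or a hash of them) would confirm whether the output really matches.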
A quick look turned up this commit, which is likely the culprit.