I want to inflate zlib-compressed data. I tried the following in Python:
zlib.decompress(data)
-> it returns the following error: zlib.error: Error -3 while decompressing data: incorrect data check
So I found a way to ignore the data check:
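import zlib
from io import BytesIO

def decompress_corrupted(data):
    # MAX_WBITS | 32 auto-detects a zlib or gzip header
    d = zlib.decompressobj(zlib.MAX_WBITS | 32)
    f = BytesIO(data)
    result_str = b''
    buffer = f.read(1)
    try:
        # Feed one byte at a time so everything decoded before the error is kept
        while buffer:
            result_str += d.decompress(buffer)
            buffer = f.read(1)
    except zlib.error:
        # Stop at the first error (here, the failed data check)
        pass
    return result_str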
But the result is partially corrupted: I get RTF content with a few mistakes.
My question: since I know the compression uses the zlib algorithm, which configuration options (or pre-/post-processing steps) could I try in order to recover the original document?
Context: the software used to compress these files is no longer maintained, and its vendor has never answered our messages. We only have a compiled viewer, but we need the exact algorithm in order to migrate to an alternative solution. We know these files are not corrupted, since the current viewer displays them properly.
If it can help:
There is no "configuration" needed. zlib's inflate will losslessly inflate any valid zlib stream to the original content.
Therefore, despite your assertion, your data is being corrupted or deliberately modified somewhere along the way.
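To narrow down where the modification happens, it may help to look at the stream's framing directly. "Incorrect data check" means the Adler-32 of the inflated output does not match the 4-byte value stored after the deflate data. The following is only a diagnostic sketch, assuming a standard zlib stream (RFC 1950) with no preset dictionary; inspect_zlib_stream is a hypothetical helper name, not part of the zlib module. It inflates the payload as raw deflate, so the trailer is not enforced, and reports both checksums:

```python
import zlib

def inspect_zlib_stream(data: bytes) -> dict:
    """Inflate the deflate payload of a zlib stream and report both checksums.

    Hypothetical diagnostic helper; assumes a 2-byte zlib header (RFC 1950).
    """
    if len(data) < 2:
        raise ValueError("too short to contain a zlib header")
    cmf, flg = data[0], data[1]
    # Valid header: compression method 8 (deflate) and (CMF*256 + FLG) % 31 == 0
    header_ok = (cmf & 0x0F) == 8 and ((cmf << 8) | flg) % 31 == 0
    if flg & 0x20:
        # FDICT bit set: a preset dictionary is required, not handled here
        raise ValueError("stream needs a preset dictionary")
    # Negative wbits selects raw deflate: no header parsing, no Adler-32 check
    d = zlib.decompressobj(-zlib.MAX_WBITS)
    output = d.decompress(data[2:]) + d.flush()
    # Bytes left after the final deflate block should be the Adler-32 trailer
    trailer = d.unused_data[:4]
    stored = int.from_bytes(trailer, "big") if len(trailer) == 4 else None
    return {
        "header_ok": header_ok,
        "stream_complete": d.eof,  # True if the final deflate block was reached
        "stored_adler32": stored,
        "computed_adler32": zlib.adler32(output),
        "output": output,
    }
```

If the deflate data itself is damaged, d.decompress raises zlib.error partway through, which points to corruption inside the payload. If the stream inflates completely but the two checksums differ, the output you got is not what was originally compressed, or the trailer was altered after compression.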