I'm writing my own unzip code, and (from trial and error, without really understanding why) it looks like the CRC-32 update on the single byte that ZIP decryption requires doesn't quite match zlib's. To convert from one to the other:
import zlib

def crc32(ch, crc):
    crc = zlib.crc32(bytes([~ch & 0xFF]), crc)
    return (~crc & 0xFF000000) + (crc & 0x00FFFFFF)
Why is this? (Or am I wrong?)
Edit: the reason I think there is at least a possibility that I'm right is that, at https://github.com/uktrade/stream-unzip/blob/d23400028abbe3b0d7e1951cb562cd0541bfc960/stream_unzip.py#L89, I use the above successfully to decrypt encrypted ZIP files:
def decrypt(chunks):
    key_0 = 305419896
    key_1 = 591751049
    key_2 = 878082192

    def crc32(ch, crc):
        crc = zlib.crc32(bytes([~ch & 0xFF]), crc)
        return (~crc & 0xFF000000) + (crc & 0x00FFFFFF)

    def update_keys(byte):
        nonlocal key_0, key_1, key_2
        key_0 = crc32(byte, key_0)
        key_1 = (key_1 + (key_0 & 0xFF)) & 0xFFFFFFFF
        key_1 = ((key_1 * 134775813) + 1) & 0xFFFFFFFF
        key_2 = crc32(key_1 >> 24, key_2)

    def decrypt(chunk):
        chunk = bytearray(chunk)
        for i, byte in enumerate(chunk):
            temp = key_2 | 2
            byte ^= ((temp * (temp ^ 1)) >> 8) & 0xFF
            update_keys(byte)
            chunk[i] = byte
        return chunk

    yield_all, _, get_num, _ = get_byte_readers(chunks)

    for byte in password:
        update_keys(byte)

    if decrypt(get_num(12))[11] != mod_time >> 8:
        raise ValueError('Incorrect password')

    for chunk in yield_all():
        yield decrypt(chunk)
However, if I replace the crc32 function above with a plain call to zlib's, decryption fails (e.g. it complains about an incorrect password).
OK, you're not completely wrong. It is indeed the same CRC-32 algorithm, but without the pre- and post-processing (inverting the CRC coming in and going out). The code in the question is a truly odd way of replicating that with the zlib.crc32 function. All you need is this:
def crc32(ch, crc):
    return ~zlib.crc32(bytes([ch]), ~crc) & 0xffffffff
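To see that the two wrappers really compute the same thing, here is a minimal sketch that compares both against a bitwise CRC-32 step with no inversions. The names `crc32_raw_update`, `crc32_question`, `crc32_answer` and `all_agree` are made up for this illustration, and the extra mask on the value passed to zlib is a defensive assumption rather than something the one-liner above needs.

```python
import zlib

def crc32_raw_update(crc, byte):
    """One raw CRC-32 step (reflected polynomial 0xEDB88320), with no inversions."""
    crc ^= byte
    for _ in range(8):
        crc = (crc >> 1) ^ (0xEDB88320 if crc & 1 else 0)
    return crc

def crc32_question(ch, crc):
    # The wrapper from the question: invert the input byte, then flip the top 8 bits.
    crc = zlib.crc32(bytes([~ch & 0xFF]), crc)
    return (~crc & 0xFF000000) + (crc & 0x00FFFFFF)

def crc32_answer(ch, crc):
    # Undo zlib's inversion on the way in and on the way out.
    # The inner mask keeps the value passed to zlib non-negative (defensive assumption).
    return ~zlib.crc32(bytes([ch]), ~crc & 0xFFFFFFFF) & 0xFFFFFFFF

def all_agree(keys, byte_values=range(256)):
    """True if all three implementations match for every (key, byte) pair."""
    return all(
        crc32_raw_update(k, b) == crc32_question(b, k) == crc32_answer(b, k)
        for k in keys
        for b in byte_values
    )

# The three initial keys from the question, plus the two extreme values.
print(all_agree([0, 305419896, 591751049, 878082192, 0xFFFFFFFF]))
```

The agreement is not a coincidence. Inverting the CRC on the way in flips the low 24 bits of the raw result, and zlib's inversion on the way out then flips all 32 bits, so only the top 8 bits end up flipped. Inverting the input byte keeps the table index unchanged. That is why the question's version has to flip the top byte back.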