
How does one find the start of the "Central Directory" in zip files?


Wikipedia has an excellent description of the ZIP file format, but the "central directory" structure is confusing to me. Specifically this:

This ordering allows a ZIP file to be created in one pass, but it is usually decompressed by first reading the central directory at the end.

The problem is that even the trailing record for the central directory is variable length. How, then, can someone find the start of the central directory in order to parse it?

(Oh, and I did spend some time looking at APPNOTE.TXT in vain before coming here and asking :P)


Solution

  • My condolences. Reading the Wikipedia description gives me the very strong impression that you need to do a fair amount of guess-and-check work:

    Hunt backwards from the end of the file for the end-of-central-directory signature 0x06054b50 (stored little-endian, so the bytes on disk are 50 4B 05 06, i.e. "PK\x05\x06"). Sixteen bytes past the start of that signature is a 4-byte field holding the offset of the start of the central directory; seek there and hope you find the central directory file header signature 0x02014b50. You can add sanity checks, such as confirming that the 2-byte comment length field (20 bytes past the start of the signature) accounts for exactly the bytes remaining before the end of the file. Even so, it sure feels like ZIP decoders work because people don't put funny characters into their zip comments, filenames, and so forth. Based entirely on the Wikipedia page, anyhow.
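
    The hunt described above can be sketched in Python roughly as follows. This is a minimal sketch, not a full ZIP reader. The function name find_central_directory and the constant names are made up for illustration. It assumes the whole file is already in memory as bytes, that the archive is a plain single-disk ZIP (no ZIP64 records), and that nothing has been prepended to the file, as self-extracting archives do. Because the comment length field is only 2 bytes wide, the comment can be at most 65535 bytes, so the signature can only start within the last 22 + 65535 bytes of the file.

```python
import struct

EOCD_SIG = b"PK\x05\x06"   # 0x06054b50 as it appears on disk (little-endian)
CDFH_SIG = b"PK\x01\x02"   # 0x02014b50 as it appears on disk (little-endian)
EOCD_MIN_SIZE = 22         # fixed part of the end-of-central-directory record
MAX_COMMENT = 0xFFFF       # comment length is a 2-byte field


def find_central_directory(data: bytes) -> int:
    """Return the offset of the first central directory header in `data`.

    Hypothetical helper: assumes a plain single-disk ZIP, no ZIP64,
    and no data prepended to the archive.
    """
    lowest = max(0, len(data) - EOCD_MIN_SIZE - MAX_COMMENT)
    pos = len(data) - EOCD_MIN_SIZE  # latest place the record can start
    while pos >= lowest:
        # Search backwards for a signature starting at or before `pos`.
        pos = data.rfind(EOCD_SIG, lowest, pos + 4)
        if pos < 0:
            break
        # Offsets 12, 16, 20: directory size, directory offset, comment length.
        cd_size, cd_offset, comment_len = struct.unpack_from("<IIH", data, pos + 12)
        # Sanity checks: the comment must run exactly to the end of the file,
        # the directory must fit before this record, and it must start with
        # a central directory header (unless the archive is empty).
        ends_at_eof = pos + EOCD_MIN_SIZE + comment_len == len(data)
        fits = cd_offset + cd_size <= pos
        starts_ok = cd_size == 0 or data[cd_offset:cd_offset + 4] == CDFH_SIG
        if ends_at_eof and fits and starts_ok:
            return cd_offset
        pos -= 1  # false positive (e.g. signature bytes inside the comment)
    raise ValueError("no usable end-of-central-directory record found")
```

    Calling find_central_directory(open(path, "rb").read()) on an ordinary archive should return the offset at which the first "PK\x01\x02" directory entry begins. The checks reduce the guesswork but do not remove it: a comment could still contain bytes that imitate a complete, self-consistent record.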