What are "disks" in this context of the structure of ZIP files?


I'm currently working on a mini library for myself to compress and extract ZIP files. So far I don't have any major problems with the documentation, except that I don't get what "disks" are in a ZIP file and how to calculate the number of a disk:

4.3.16 End of central directory record:

end of central dir signature    4 bytes  (0x06054b50)

number of this disk             2 bytes   <= What does "disk" mean here?

number of the disk with the
start of the central directory  2 bytes   <= What does "disk" mean here?

total number of entries in the
central directory on this disk  2 bytes   <= What does "disk" mean here?

total number of entries in
the central directory           2 bytes

size of the central directory   4 bytes

offset of start of central
directory with respect to
the starting disk number        4 bytes   <= What does "disk" mean here?

.ZIP file comment length        2 bytes

.ZIP file comment       (variable size)

Solution

  • The term disk refers to floppy diskettes in the context of splitting and spanning ZIP files (see chapter 8.0 of the documentation that you provided; emphasis mine):

    8.1.1 Spanning is the process of segmenting a ZIP file across multiple removable media. This support has typically only been provided for DOS formatted floppy diskettes.

    Nowadays, implementations often no longer support splitting and spanning (e.g. here (see Limitations) or here (see procedure finish_zip())), as floppy disks or even compact disks went out of fashion. If you are fine with not supporting splitting and spanning (at first), you can set the values as follows:

    number of this disk             2 bytes   <=  You only have one disk/file, and disk
                                                  numbers start at 0, so set it to 0.
    
    number of the disk with the
    start of the central directory  2 bytes   <=  Same single disk/file, so set it to 0.
    
    total number of entries in the
    central directory on this disk  2 bytes   <=  Set it to the overall number of records
                                                  (same value as the total number of
                                                  entries in the central directory).
    
    offset of start of central
    directory with respect to
    the starting disk number        4 bytes   <=  Set this offset (in bytes) relative to
                                                  the start of your archive.
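
As a rough illustration of the single-file case, the record can be packed with Python's `struct` module. This is only a sketch: `build_eocd` and its parameter names are made up for this example, and it assumes both disk fields are 0, which is what single-file archives commonly contain.

```python
import struct

EOCD_SIGNATURE = 0x06054B50
# Little-endian layout: signature (4 bytes), four 2-byte fields,
# two 4-byte fields, then the 2-byte comment length.
EOCD_FORMAT = "<IHHHHIIH"


def build_eocd(entry_count, cd_size, cd_offset, comment=b""):
    """Pack an end of central directory record for a single-file archive.

    Hypothetical helper: no splitting/spanning and no ZIP64 support.
    """
    return struct.pack(
        EOCD_FORMAT,
        EOCD_SIGNATURE,
        0,             # number of this disk (assumed single file)
        0,             # number of the disk with the start of the central directory
        entry_count,   # total number of entries in the central directory on this disk
        entry_count,   # total number of entries in the central directory
        cd_size,       # size of the central directory in bytes
        cd_offset,     # offset of the central directory from the start of the archive
        len(comment),  # .ZIP file comment length
    ) + comment


# An archive with no entries consists of just this 22-byte record.
empty_archive = build_eocd(entry_count=0, cd_size=0, cd_offset=0)
```

Writing `empty_archive` to a file should give an empty archive that common ZIP tools accept.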
    

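    If you want to support splitting or spanning, increase the disk number every time you start writing to a new disk/file. The end of central directory record is written once, on the last disk/file: "number of this disk" is that disk's number, and "total number of entries in the central directory on this disk" counts only the central directory entries stored on that disk, so restart that count for each new disk/file. Calculate the offset relative to the start of the disk/file on which the central directory begins, not the start of the whole archive.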
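
The bookkeeping for a split archive could look roughly like the sketch below. The function and key names are hypothetical. It assumes disks are numbered from 0 and that the writer has tracked how many central directory entries landed on each disk/file.

```python
def spanning_eocd_fields(cd_entries_per_disk, cd_start_disk, cd_start_offset):
    """Derive the disk-related end of central directory fields for a split archive.

    cd_entries_per_disk[i] is the number of central directory entries written
    to disk i (zero-based, an assumption of this sketch); the last list element
    is the disk that also holds the end of central directory record.
    """
    this_disk = len(cd_entries_per_disk) - 1
    return {
        "number_of_this_disk": this_disk,
        "disk_with_cd_start": cd_start_disk,
        "entries_on_this_disk": cd_entries_per_disk[this_disk],
        "total_entries": sum(cd_entries_per_disk),
        # Offset counted from the start of disk `cd_start_disk`,
        # not from the start of the first disk.
        "cd_offset": cd_start_offset,
    }


# Three disks: the central directory starts on disk 1 at byte 4096,
# with 2 entries on disk 1 and 5 entries on disk 2.
fields = spanning_eocd_fields([0, 2, 5], cd_start_disk=1, cd_start_offset=4096)
```

Here `fields["number_of_this_disk"]` is 2 and `fields["total_entries"]` is 7, while `fields["entries_on_this_disk"]` is 5 because only the last disk's entries are counted.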