Search code examples
tiff

Writing TIFF pixel data last and making 8 GB TIFF files


Is there any reason not to write TIFF pixel data last? Typically a simple TIFF file starts with a header that describes endianness and contains the offset to the first IFD, then the pixel data, followed at the end of the file by the IFD and then the extra data that the IFD tags point to. All TIFF files I've seen are written in this order, however the TIFF standard says nothing about mandating such an order, in fact it says that "Compressed or uncompressed image data can be stored almost anywhere in a TIFF file", and I can't imagine that any TIFF parser would mind the order as they must follow the IFD offset and then follow the StripOffsets (tag 273) tags. I think it makes more sense to put the pixel data last so that one goes through a TIFF file more sequentially without jumping from the top to the bottom back to the top for no good reason, but yet I don't see anyone else doing this, which perplexes me slightly.

Part of the reason why I'm asking is that a client of mine tries to create TIFF files slightly over 4 GB which doesn't work due to the IFD offsets overflowing, and I'm thinking that even though the TIFF standard claims TIFF files cannot exceed 2^32 bytes there might be a way to create 8 GB TIFF files that would be accepted by most TIFF parsers if we put everything that isn't pixel data first so that they all have very small offsets, and then point to two strips of pixel data, a first strip followed by a second that would start before offset 2^32 and that is itself no larger than 2^32-1, thus giving us TIFF files that can be up to 2*(2^32-1) bytes and still be in theory readable despite being limited to 32 bit offsets and sizes.

Please note that this is a question about the TIFF format, I'm not talking about what any third-party library would accept to write as I wrote my own TIFF writing code, nor am I asking about BigTIFF.


Solution

  • It's OK to write pixel data after the IFD structure and tag data. Many TIFF writing software and libraries do that. Most notable, libtiff based software writes image data first. As for writing image data in huge strips, possibly extending the 4GB file size, check with the software/libraries you intend to read the files with. Software compiled for 32-bit or implementation details might prevent reading such files. I found that for example modern libtiff based software, Photoshop, and tifffile can read such files, while ImageJ, BioFormats, and Paint.NET can not.