
Determining the size of a JPEG (JFIF) image


I need to find the size of a JPEG (JFIF) image. The image is not saved as a stand-alone file, so I can't use GetFileSize or any similar API (the image is placed in a stream, and no header is present other than the usual JPEG/JFIF headers).

I did some research and found out that JPEG images are composed of different parts, each part starting with a marker (0xFF 0xXX) followed by the size of that part. Using this, I was able to parse a lot of information from the file.

The problem is, I cannot find the size of the compressed data, as it seems there is no frame marker for the compressed data. Also, it seems the compressed data follows the SOS (FFDA) marker and the image ends with the End Of Image (EOI) (FFD9) marker.

One way to accomplish this would be to search for the EOI marker byte by byte, but I think the compressed data might contain this combination of bytes, right?

Is there an easy and correct way to find the total size of the image? (I would prefer some code/idea without any external library)

Basically, I need the distance (in bytes) between the Start of Image (SOI, FFD8) and End of Image (EOI, FFD9) markers.


Solution

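  • The compressed data will not include SOI or EOI bytes, so you are safe there: whenever an FF byte occurs inside the compressed data, the encoder follows it with a 00 byte, so FFD8 and FFD9 cannot appear there by accident. But the comment, application data, or other headers might contain them. Fortunately, you can identify and skip these sections, as their length is given.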

    The JPEG specification tells you what you need:
    http://www.w3.org/Graphics/JPEG/itu-t81.pdf

    Look at Table B.1, on page 32. The symbols marked with an * do not have a length field following them (RST, SOI, EOI, TEM). The others do.

    You will need to skip over the various fields, but it is not too bad.

    How to go through:

    1. Start by reading the SOI marker (FFD8). It should be the first thing in the stream.

    2. Then progress through the stream, finding more markers and skipping over the headers:

      • SOI marker (FFD8): Corrupted image. You should have found an EOI already!

      • TEM (FF01): standalone marker, keep going.

      • RST (FFD0 through FFD7): standalone marker, keep going. You could validate that the restart markers count up from FFD0 through FFD7 and repeat, but that is not necessary for measuring the length.

      • EOI marker (FFD9): You're done!

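      • Any marker that is not RST, SOI, EOI, TEM (FF01 through FFFE, minus the exceptions above): After the marker, read the next 2 bytes; this is the 16-bit big-endian length of that segment (not including the 2-byte marker, but including the length field itself). Skip the rest of the segment, which is length minus 2 bytes, since you have already read the length field. Note that the compressed data after the SOS (FFDA) header has no length of its own: scan through it for the next FF byte that is not followed by 00, which is the next marker.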

      • If you get an end-of-file before EOI, then you've got a corrupted image.

      • Once you've got an EOI, you've gotten through the JPEG and should have the length. You can start again by reading another SOI if you expect more than one JPEG in your stream.
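
The steps above could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not a validated parser: the function name `jpeg_length` and its `start` parameter are made up for illustration, the whole stream is assumed to be in memory as `bytes`, and FF 00 pairs (stuffed bytes inside the compressed data) are skipped while looking for the next marker.

```python
import struct


def jpeg_length(data: bytes, start: int = 0) -> int:
    """Return the number of bytes from SOI through EOI (inclusive).

    `data` holds the stream and `start` is the offset of the SOI marker.
    Both names are illustrative assumptions, not part of any library.
    Raises ValueError if the image is missing its SOI or EOI, or is corrupt.
    """
    if data[start:start + 2] != b"\xff\xd8":
        raise ValueError("stream does not start with SOI (FFD8)")

    pos = start + 2
    end = len(data)
    while pos + 1 < end:
        if data[pos] != 0xFF:
            # Ordinary byte (compressed data after SOS): keep scanning.
            pos += 1
            continue

        code = data[pos + 1]
        if code == 0x00:
            # FF 00 is a stuffed byte inside compressed data, not a marker.
            pos += 2
            continue
        if code == 0xFF:
            # Fill byte before a marker: re-examine from the next FF.
            pos += 1
            continue

        pos += 2  # consume the 2-byte marker
        if code == 0xD9:
            # EOI: done, report the distance from SOI.
            return pos - start
        if code == 0xD8:
            raise ValueError("unexpected SOI before EOI")
        if code == 0x01 or 0xD0 <= code <= 0xD7:
            # TEM or RST0..RST7: standalone markers, no length field.
            continue

        # Every other marker is followed by a 16-bit big-endian length
        # that counts the length field itself but not the marker.
        if pos + 2 > end:
            break
        (seg_len,) = struct.unpack_from(">H", data, pos)
        if seg_len < 2:
            raise ValueError("invalid segment length")
        pos += seg_len

    raise ValueError("reached end of data before EOI (FFD9)")
```

Calling `jpeg_length(buf)` on a buffer that starts with an image gives the size of that image even if more data follows it. If several images sit back to back in the stream, the end offset of one image can be passed as `start` to measure the next.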