
Why is "no code allowed to be all ones" in libjpeg's Huffman decoding?


I'm trying to satisfy myself that METEOSAT images I'm getting from their FTP server are actually valid images. My doubt arises because all the tools I've used so far complain about "Bogus Huffman table definition" - yet when I simply comment out that error message, the image appears quite plausible (a greyscale segment of the Earth's disc).

From https://github.com/libjpeg-turbo/libjpeg-turbo/blob/jpeg-8d/jdhuff.c#L379:

while (huffsize[p]) {
  while (((int) huffsize[p]) == si) {
    huffcode[p++] = code;
    code++;
  }
  /* code is now 1 more than the last code used for codelength si; but
   * it must still fit in si bits, since no code is allowed to be all ones.
   */
  if (((INT32) code) >= (((INT32) 1) << si))
    ERREXIT(cinfo, JERR_BAD_HUFF_TABLE);
  code <<= 1;
  si++;
}

If I simply comment out the check, or add a check for huffsize[p] to be nonzero (as in the containing loop's controlling expression), then djpeg manages to convert the image to a BMP which I can view with few problems.
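For reference, the second workaround can be sketched as a standalone function. This is only an illustration under stated assumptions: `assign_codes` and its `relaxed` flag are hypothetical names, not part of libjpeg, and the return value of -1 stands in for the `ERREXIT` call. The relaxed mode here tolerates an all-ones code only when it is the very last code in the table and still rejects tables that use more codes than a length can hold; that extra strictness is an assumption of the sketch, not something taken from libjpeg.

```c
#include <stddef.h>

/* Hypothetical standalone version of the code-assignment loop.
 * huffsize: zero-terminated list of code lengths, assumed to be in
 *           non-decreasing order (as libjpeg builds it from the DHT counts).
 * huffcode: receives one code per entry of huffsize.
 * relaxed:  0 = reject any all-ones code (the original check),
 *           1 = tolerate an all-ones code if no further codes follow.
 * Returns 0 on success, -1 where libjpeg would raise JERR_BAD_HUFF_TABLE. */
int assign_codes(const unsigned char *huffsize, unsigned int *huffcode,
                 int relaxed)
{
    unsigned int code = 0;
    int si = huffsize[0];
    size_t p = 0;

    while (huffsize[p]) {
        while ((int) huffsize[p] == si) {
            huffcode[p++] = code;
            code++;
        }
        /* code is now 1 more than the last code used for length si */
        unsigned int limit = 1u << si;
        if (code > limit)
            return -1;              /* more codes than si bits can hold */
        if (code == limit && (!relaxed || huffsize[p]))
            return -1;              /* all-ones code used at length si */
        code <<= 1;
        si++;
    }
    return 0;
}
```

With lengths 1, 2, 3, 3 the assigned codes are 0, 10, 110 and 111, so the strict mode rejects the table and the relaxed mode accepts it.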

Why does the comment claim that all-ones codes are not allowed?


Solution

  • The comment says so because the standard does not allow them. That does not stop non-compliant images from existing in the wild.

    The reason they are not allowed is this (from the standard):

    Making entropy-coded segments an integer number of bytes is performed as follows: for Huffman coding, 1-bits are used, if necessary, to pad the end of the compressed data to complete the final byte of a segment.

    If an all-ones code were allowed, the last byte of compressed data could be ambiguous: the 1-bits added as padding could be read as one more coded symbol.
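The ambiguity can be illustrated with a toy prefix-code decoder. This is a minimal sketch, not how libjpeg decodes: `count_symbols` is a hypothetical helper, the code tables are made up for the example, and code words are stored as strings of '0' and '1' purely for readability.

```c
#include <string.h>

/* Count how many symbols a prefix-code decoder would read from one byte,
 * consuming bits MSB first. codes[] holds the code words as strings of
 * '0'/'1'. Trailing bits that never complete a code word are ignored. */
int count_symbols(unsigned char byte, const char *const codes[], int ncodes)
{
    char pending[9] = "";   /* up to 8 bits plus terminator */
    int len = 0;
    int count = 0;

    for (int bit = 7; bit >= 0; bit--) {
        pending[len++] = ((byte >> bit) & 1) ? '1' : '0';
        pending[len] = '\0';
        for (int i = 0; i < ncodes; i++) {
            if (strcmp(pending, codes[i]) == 0) {
                count++;            /* a complete code word was matched */
                len = 0;
                pending[0] = '\0';
                break;
            }
        }
    }
    return count;
}
```

Take the three-symbol message with code words 0, 10, 0. That is four bits, so four 1-bits of padding complete the byte: 0100 1111, or 0x4F. With the table {0, 10, 11}, which contains an all-ones code, the decoder reads five symbols, because the padding parses as two extra 11 code words. With the table {0, 10, 110}, the padding never completes a code word and the decoder reads the intended three symbols.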