Tags: python, byte, filesize, bmp

BMP file size encoding


I'm trying to understand how the file size is encoded in the bitmap file header. The Wikipedia page only shows examples of images smaller than 256 bytes, so the size fits in a single byte, followed by three 0 bytes.

The problem is that when I test out bigger images, I cannot relate the encoded bytes to the real size.

$ identify bmp1.bmp
bmp1.bmp BMP3 10x10 10x10+0+0 1-bit sRGB 2c 102B 0.000u 0:00.000
$ identify bmp2.bmp
bmp2.bmp BMP3 92x76 92x76+0+0 1-bit sRGB 2c 974B 0.000u 0:00.000
In [28]: [ord(c) for c in bmp1[2:6]]
Out[28]: [102, 0, 0, 0]

In [29]: len(bmp1)
Out[29]: 102

In [30]: [ord(c) for c in bmp2[2:6]]
Out[30]: [206, 3, 0, 0]

In [31]: len(bmp2)
Out[31]: 974

As you can see, the first image is 102 bytes, and the file header contains the size 102. But the second image is 974 bytes, and the file header contains the bytes 206 and 3. Is the size in the file header unreliable, so that I shouldn't try to read it from those bytes? Otherwise, how do you calculate 974 from 206 and 3?


Solution

  • It's an endianness issue. BMP stores multi-byte integers in little-endian order, so the least significant byte comes first:

    102 = (102 * 256^0) + (0 * 256^1) + (0 * 256^2) + (0 * 256^3)

    = 102

    974 = (206 * 256^0) + (3 * 256^1) + (0 * 256^2) + (0 * 256^3)

    = 206 + (3 * 256)

    = 206 + 768

    = 974
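
The arithmetic above can be checked in a few lines. This is a minimal sketch in Python 3 (the session in the question looks like Python 2), and `decode_le_uint32` is a hypothetical helper name introduced only for illustration. It decodes the four bytes by hand, then compares the result with two standard-library routes, `int.from_bytes` and `struct.unpack`, assuming the size field is an unsigned 32-bit little-endian integer.

```python
import struct


def decode_le_uint32(raw: bytes) -> int:
    """Hypothetical helper: sum each byte times 256**position,
    with the least significant byte at position 0."""
    return sum(byte * 256 ** i for i, byte in enumerate(raw))


# The four size bytes reported in the question for bmp2[2:6].
size_field = bytes([206, 3, 0, 0])

manual = decode_le_uint32(size_field)

# Standard-library equivalents of the manual sum.
builtin = int.from_bytes(size_field, byteorder="little")
(unpacked,) = struct.unpack("<I", size_field)  # "<" = little-endian, "I" = unsigned 32-bit

print(manual, builtin, unpacked)  # 974 974 974
```

In practice, something like `struct.unpack("<I", data[2:6])` on the raw file contents is probably the most direct way to read the field, since the format string states both the byte order and the width.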