I have an 800x800 RGB bitmap with a file size of 2501 KB, and I do the following (using Python 3.6):
(Unfortunately I cannot share the image.)
from PIL import Image
import numpy as np
im = Image.open('original_image.bmp')
im.save("test_size_manual.bmp", "BMP")
For some reason the new file is only 1876 KB. Even though the file sizes differ, the following holds:
import matplotlib.pylab as plt
original_image = plt.imread('original_image.bmp')
test_size_image = plt.imread('test_size_manual.bmp')
assert (original_image == test_size_image).all()
This means that the resulting numpy.ndarray is the same pixel for pixel. In a 'random' sampling of 800x800 BMPs found on Google Images, most had the same file size as the new image, 1876 KB, but at least one had the same file size as the original image, 2501 KB.
What is causing this difference in file size, or how would you go about finding out?
The answer is indeed found in the metadata.
The original image turns out to be a 32-bit bitmap (4 bytes per pixel) and the new image is a 24-bit bitmap (3 bytes per pixel). This explains the difference in file size: 2501 * 3/4 is just under 1876.
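To make the arithmetic concrete, here is a rough sketch that estimates the size of an uncompressed BMP from its dimensions and bit depth. The function name `expected_bmp_size` and the 54-byte header default (14-byte file header plus a 40-byte BITMAPINFOHEADER, no palette) are my own assumptions; the original file may well carry a larger header.

```python
def expected_bmp_size(width, height, bits_per_pixel, header_bytes=54):
    """Estimate the size in bytes of an uncompressed BMP.

    header_bytes=54 is an assumption (14-byte file header + 40-byte
    BITMAPINFOHEADER, no colour table); real files can have larger headers.
    """
    # Each pixel row is padded to a multiple of 4 bytes.
    row_bytes = (width * bits_per_pixel + 31) // 32 * 4
    return header_bytes + row_bytes * height


for bpp in (24, 32):
    size = expected_bmp_size(800, 800, bpp)
    print(bpp, "bit:", size, "bytes =", round(size / 1024, 2), "KiB")
```

Under these assumptions the 24-bit file comes out at 1,920,054 bytes (about 1875.05 KiB) and the 32-bit file at 2,560,054 bytes (about 2500.05 KiB), which matches the reported 1876 KB and 2501 KB if the file manager rounds up.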
The bit depth is stored at offset 28 (0x1C) of the file as a 2-byte little-endian integer; for the original it was 32 and for the new image it was 24.
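If you want to check this yourself without a hex editor, something along these lines should work. It is a minimal sketch that assumes the common Windows header layout (BITMAPINFOHEADER or later); the helper name `bmp_bit_depth` is just something I made up for illustration.

```python
import struct


def bmp_bit_depth(path):
    """Return the bits-per-pixel field of a BMP file.

    Assumes a Windows-style header (BITMAPINFOHEADER or later), where the
    field sits at byte offset 28. Older OS/2 headers use a different layout.
    """
    with open(path, "rb") as f:
        header = f.read(30)
    if len(header) < 30 or header[:2] != b"BM":
        raise ValueError("not a BMP file, or header too short")
    # 2-byte little-endian unsigned integer at offset 28 (0x1C)
    (bits_per_pixel,) = struct.unpack_from("<H", header, 28)
    return bits_per_pixel


# Example usage with the file names from the question:
# print(bmp_bit_depth("original_image.bmp"))
# print(bmp_bit_depth("test_size_manual.bmp"))
```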
Reference: BMP file format on Wikipedia