Tags: python-2.7, image-processing, python-imaging-library, image-compression

Saving image twice using PIL


What happens if I save an image twice using PIL, with the same quality setting?

from PIL import Image

quality = 85

# Open original image and save 
img = Image.open('image.jpg')
img.save('test1.jpg', format='JPEG', quality=quality)

# Open the saved image and save again with same quality
img = Image.open('test1.jpg')
img.save('test2.jpg', format='JPEG', quality=quality)

There is almost no difference in the image size or the image quality.

Can I assume that saving an image multiple times with the same quality does not affect the actual image quality, and that it is safe to do so?

Also, if I save an image with quality 85 and then open it and save it with quality 95, the file becomes much larger. Does that mean PIL decompresses the image and compresses it again?


Solution

  • In most cases your test1.jpg and test2.jpg images will be slightly different. That is, some of the information stored in test1.jpg is lost when you open it (decompress) and save it again (recompress) with lossy JPEG compression.
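
The decompress/recompress round trip described above can be sketched in memory. This is a minimal sketch and not part of the original answer: it assumes Python 3 and a Pillow version that exposes the quantization attribute on opened JPEG files, and make_test_image, save_jpeg, and open_jpeg are hypothetical helper names. A synthetic noise image is used so that no input file is needed.

```python
import io
import random

from PIL import Image


def make_test_image(size=(64, 64), seed=0):
    """Build a deterministic noisy RGB image (stand-in for a real photo)."""
    rng = random.Random(seed)
    data = bytes(rng.randrange(256) for _ in range(size[0] * size[1] * 3))
    return Image.frombytes("RGB", size, data)


def save_jpeg(img, quality):
    """Encode img as JPEG in memory and return the raw file bytes."""
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    return buf.getvalue()


def open_jpeg(data):
    """Decode JPEG bytes; load() forces the pixel data to be decompressed."""
    img = Image.open(io.BytesIO(data))
    img.load()
    return img


first = open_jpeg(save_jpeg(make_test_image(), quality=85))
same = open_jpeg(save_jpeg(first, quality=85))
higher = open_jpeg(save_jpeg(first, quality=95))

# The quantization tables are chosen from the quality setting at save time.
print("tables equal at 85 -> 85:", first.quantization == same.quantization)
print("tables equal at 85 -> 95:", first.quantization == higher.quantization)
```

Saving at the same quality writes the same quantization tables, while saving at 95 writes different ones. That is consistent with the larger file noted in the question: the decoded pixels are encoded again from scratch with the new setting.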

    In some cases, however, opening and saving a JPEG image with the same software will not introduce any changes.

    Take a look at this example:

    from PIL import Image
    import os
    import hashlib
    
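    def md5sum(fn):
        """Return the MD5 hex digest of the file's raw bytes."""
        hasher = hashlib.md5()
        with open(fn, 'rb') as f:
            hasher.update(f.read())
        return hasher.hexdigest()
    
    INPUT_IMAGE_FILENAME = 'image.jpg'  # placeholder: path to your source image
    TMP_FILENAME = 'tmp.jpg'
    
    orig = Image.open(INPUT_IMAGE_FILENAME)
    orig.save(TMP_FILENAME)     # first JPG compression, default quality (75)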
    
    d = set()
    for i in range(10000):
    
        # Compute file statistics
        file_size = os.stat(TMP_FILENAME).st_size
        md5 = md5sum(TMP_FILENAME)
        print('Step {}, file size = {}, md5sum = {}'.format(i, file_size, md5))
        if md5 in d:
            break  # this exact file has been seen before: a cycle
        d.add(md5)
    
        # Decompress / compress
        im = Image.open(TMP_FILENAME)
        im.save(TMP_FILENAME, quality=95)
    

    It opens and saves a JPG file repeatedly until a cycle is found (meaning the saved file is byte-for-byte identical to one produced at an earlier step).

    In my testing, it takes anywhere from 50 to 700 cycles to reach a steady state (where opening and saving the image no longer produces any loss). However, the final "steady" image is noticeably different from the original.

    Image after first JPG compression:

    [Image: Coke can lying on a beach in Ukraine]

    Resulting "steady" image after 115 compress/decompress cycles:

    [Image: Coke can lying on a beach in Ukraine (faded colors, lost sharpness)]

    Sample output:

    Step 0, file size = 38103, md5sum = ea28705015fe6e12b927296c53b6d147
    Step 1, file size = 71707, md5sum = f5366050780be7e9c52dd490e9e69316
    ...
    Step 113, file size = 70050, md5sum = 966aabe454aa8ec4fd57875bab7733da
    Step 114, file size = 70050, md5sum = 585ecdd66b138f76ffe58fe9db919ad7
    Step 115, file size = 70050, md5sum = 585ecdd66b138f76ffe58fe9db919ad7
    

    So even though I used a relatively high quality setting of 95, repeated compression and decompression caused the image to lose color and sharpness. Even with a quality setting of 100 the result is very similar, despite a file size almost twice as large.
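
The MD5 check only shows whether the file bytes changed, not by how much the picture changed. As a rough complement, the sketch below measures how far the pixels move away from the first JPEG generation. It is an illustration under assumptions, not the method used for the results above: it assumes Python 3, uses a synthetic noise image in place of a real photo, and make_test_image, roundtrip, and mean_abs_diff are hypothetical helper names.

```python
import io
import random

from PIL import Image, ImageChops, ImageStat


def make_test_image(size=(64, 64), seed=0):
    """Build a deterministic noisy RGB image (stand-in for a real photo)."""
    rng = random.Random(seed)
    data = bytes(rng.randrange(256) for _ in range(size[0] * size[1] * 3))
    return Image.frombytes("RGB", size, data)


def roundtrip(img, quality):
    """Save img as JPEG in memory, then decode it again."""
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    out = Image.open(io.BytesIO(buf.getvalue()))
    out.load()  # force decompression of the pixel data
    return out


def mean_abs_diff(a, b):
    """Mean absolute per-pixel difference, averaged over the colour bands."""
    diff = ImageChops.difference(a, b)
    band_means = ImageStat.Stat(diff).mean
    return sum(band_means) / len(band_means)


first = roundtrip(make_test_image(), quality=95)  # first JPEG generation
current = first
drift = []
for step in range(1, 21):
    current = roundtrip(current, quality=95)
    drift.append(mean_abs_diff(first, current))
    print("Step {}: mean abs diff vs first generation = {:.3f}".format(step, drift[-1]))
```

A value of 0 would mean the pixels are identical to the first generation. To try it on a real photo, replace make_test_image() with Image.open(...) on your own file, converted to RGB.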