python, python-3.x, jpeg, python-imaging-library, image-compression

Read jpg compression quality


Update:

After further reading, I found that the compression quality is not stored in the image header, and that the quality number is largely meaningless anyway because different programs interpret it differently. I'm still leaving this here in case anybody has a very clever solution to the problem.

I now write a log recording when I last compressed each image and compare that time with the file's last-modified time. I would still like to solve the problem of getting the quality of other images, where I can't rely on the modification time.
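For reference, the log approach can be sketched roughly as follows. This is only a minimal sketch: the log file name `compress_log.json` and all function names are made up for illustration, and it assumes nothing else touches the files without changing their modification time.

```python
import json
import os

LOG_FILE = "compress_log.json"  # hypothetical log location


def load_log(log_file=LOG_FILE):
    """Return {absolute path: mtime recorded after compression}, or {} if no log exists."""
    try:
        with open(log_file, encoding="utf-8") as fh:
            return json.load(fh)
    except FileNotFoundError:
        return {}


def save_log(log, log_file=LOG_FILE):
    """Write the log back to disk as JSON."""
    with open(log_file, "w", encoding="utf-8") as fh:
        json.dump(log, fh, indent=2)


def already_compressed(path, log):
    """True if the file has not been modified since it was last marked as compressed."""
    recorded = log.get(os.path.abspath(path))
    return recorded is not None and os.path.getmtime(path) <= recorded


def mark_compressed(path, log):
    """Call right after saving the compressed file; stores the file's new mtime."""
    log[os.path.abspath(path)] = os.path.getmtime(path)
```

Storing the file's own modification time, rather than the current clock time, avoids comparing two different clocks. In the walk loop, a file would be skipped when `already_compressed(path, log)` is true and passed to `mark_compressed(path, log)` after each save.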

Original question:

I have a small Python script that walks through certain folders and compresses certain .jpg files to quality 60 using the Pillow library.

It works; however, on a second run it would compress all the images that were already compressed.

So is there a way to get the compression or quality the .jpg currently has so I can skip the file if it was already compressed?

import os
from os.path import join, getsize
from PIL import Image

start_directory = ".\\test"

for root, dirs, files in os.walk(start_directory):
    for f in files:

        try:
            with Image.open(join(root, f)) as im:
                if im.quality > 60: # <= something like this
                    im.save(join(root, f),optimize=True,quality=60)
            im.close() # <= not sure if necessary
        except IOError:
            pass

Solution

  • There is a way this might be solved: you could use bits per pixel to check how heavily a JPEG is already compressed.

    As I understand it, the JPEG "quality" setting is a number where higher values mean less compression and better image quality. In Pillow it runs from 0 (worst) to 95 (best), and even 100 is not lossless. As you say, what the number actually does will vary from encoder to encoder, and the amount of compression achieved will vary with the image content. Saving at 60 will reduce the image quality and usually the file size, but the number is not a percentage: it does not mean the file ends up at 60% of its original size.

    Edit: the above paragraph is my interpretation of the Pillow documentation, but do note that Pillow documents "quality" as a parameter for saving images. I can't see any documentation for reading a "quality" value back from a file, so im.quality in your snippet is not something you can rely on; any such number would in any case depend on the interpretation of the encoder that last wrote the JPEG.

    All your images will probably have different dimensions. What you want to look at is bits per pixel.

    The question is: why do you compress at quality factor 60 in the first place? Is it correct for all your images? Are you trying to achieve a specific file size? Or are you happy just to make them all smaller?

    If you, instead, aim for a specific number of bits per pixel, then your check just becomes a calculation: file size in bits (bytes × 8) divided by the number of pixels in the image (width × height), compared against your desired bits per pixel. If it’s bigger, apply compression.

    Of course, you’ll then have to be slightly more clever than always selecting quality factor 60. The 60 is an internal encoder setting rather than 60% of the original file size, so you can’t calculate the resulting size directly; you may need to use trial and improvement to get the desired file size.
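A rough sketch of the bits-per-pixel check and the trial-and-improvement step might look like the following. The function names and the default target of 1.5 bits per pixel are placeholders chosen for illustration, not recommended values, and the binary search assumes that file size grows with the quality setting, which holds only approximately.

```python
import io
import os

from PIL import Image


def bits_per_pixel(path):
    """File size in bits divided by the number of pixels in the image."""
    with Image.open(path) as im:
        width, height = im.size
    return os.path.getsize(path) * 8 / (width * height)


def encoded_bpp(im, quality):
    """Bits per pixel of `im` when encoded as JPEG in memory at `quality`."""
    buf = io.BytesIO()
    im.save(buf, format="JPEG", optimize=True, quality=quality)
    return len(buf.getvalue()) * 8 / (im.width * im.height)


def quality_for_target(im, target_bpp, lo=1, hi=95):
    """Binary-search the highest quality whose output is at or below target_bpp.

    Returns None if even the lowest quality produces a larger file.
    """
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if encoded_bpp(im, mid) <= target_bpp:
            best = mid      # fits: try a higher quality
            lo = mid + 1
        else:
            hi = mid - 1    # too big: try a lower quality
    return best


def compress_if_needed(path, target_bpp=1.5):
    """Recompress `path` in place only if it is above target_bpp. Returns True if rewritten."""
    if bits_per_pixel(path) <= target_bpp:
        return False  # already small enough, skip
    with Image.open(path) as src:
        im = src.copy()  # fully load the pixels so the file can be overwritten
    quality = quality_for_target(im, target_bpp)
    if quality is None:
        return False  # target not reachable; leave the file untouched
    im.save(path, format="JPEG", optimize=True, quality=quality)
    return True
```

Because the check looks at the file as it is on disk, a second run skips any image that already meets the target, regardless of which program compressed it. Note that resaving this way does not carry over metadata such as EXIF unless it is passed to `save` explicitly.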