
Find similar image if resolution was changed


I'm using Python to check whether a web banner in GIF format exists on hundreds of websites. I have a folder of example GIF files, and I'm comparing the GIF files from each website against those examples. I use filecmp, but I found that many websites compress the GIF files, so even when the files are visually identical, filecmp won't detect them as the same.
Is there any Python library to detect whether two GIF files or videos are similar even if the resolution has changed?


Solution

  • Comparing two images for similarity is a general image processing problem so the solution you develop can be as simple or complex as you want it to be. In your specific case, you'll need a method for making two images the same size and a method for comparing the images.

    First, you'll probably want to convert the images to RGB or grayscale arrays for comparison.

    I would suggest reducing the size of the larger image to the size of the smaller image. That is less likely to introduce artifacts than increasing the size of the smaller image. Resizing can be accomplished with the Python Pillow library.

    Image.resize(size, resample=None, box=None, reducing_gap=None)
    

    https://pillow.readthedocs.io/en/stable/reference/Image.html

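    The resampling method may have a small effect on the similarity measure. However, you'll probably be fine just using resample=Image.NEAREST. If aliasing from downscaling turns out to matter, a smoothing filter such as Image.LANCZOS is an alternative.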
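    The conversion and resizing steps above could be sketched roughly as follows. The helper name to_common_gray is made up for illustration, "smaller" is assumed to mean fewer pixels, and for an animated GIF only the currently selected frame (the first one after opening) is used. If the two images have different aspect ratios, the resize will distort one of them.

```python
import numpy as np
from PIL import Image


def to_common_gray(img_a, img_b, resample=Image.NEAREST):
    """Return both images as grayscale arrays at the smaller image's size.

    Hypothetical helper: takes two PIL images (e.g. from Image.open).
    """
    # GIFs are usually palette ("P") images; convert("L") gives 8-bit grayscale.
    gray_a = img_a.convert("L")
    gray_b = img_b.convert("L")

    # Assumption: the "smaller" image is the one with fewer pixels.
    target = min(gray_a.size, gray_b.size, key=lambda wh: wh[0] * wh[1])

    # Only shrink; an image already at the target size is left untouched.
    if gray_a.size != target:
        gray_a = gray_a.resize(target, resample=resample)
    if gray_b.size != target:
        gray_b = gray_b.resize(target, resample=resample)

    # NumPy arrays have shape (height, width).
    return np.asarray(gray_a), np.asarray(gray_b)
```

    With real files, the inputs would come from something like Image.open("banner.gif"), where the file name is a placeholder.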

    After making sure the images are the same size, they must be compared. One could compare them using mean squared error (MSE) or structural similarity (SSIM). Luckily, SSIM is already implemented in scikit-image.

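    from skimage.metrics import structural_similarity as ssim

    # imageA and imageB are same-size grayscale arrays; a result of 1.0 means identical.
    # For color arrays, also pass channel_axis=-1 (scikit-image 0.19 or later).
    s = ssim(imageA, imageB)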
    

    https://www.pyimagesearch.com/2014/09/15/python-compare-two-images/?_ga=2.129476107.608689084.1632687977-376301729.1627364626

    In your case, MSE might work just as well. However, if some process has changed the average brightness of an image, you'd want to first subtract each image's mean brightness from it before comparing.
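    As a minimal sketch of the MSE idea, assuming both inputs are already same-size arrays: the function name mse and the remove_mean flag are illustrative, not part of any library. A result of 0 means the arrays match exactly. What counts as "similar enough" is a threshold you would have to pick by testing on your own banners.

```python
import numpy as np


def mse(a, b, remove_mean=True):
    """Mean squared error between two same-shape image arrays."""
    # Work in float so uint8 values cannot wrap around when subtracted.
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")

    if remove_mean:
        # Subtract each image's own mean so a uniform brightness
        # shift does not count as a difference.
        a = a - a.mean()
        b = b - b.mean()

    return float(np.mean((a - b) ** 2))
```

    For example, an image and a copy that is uniformly 10 levels brighter give an MSE of 0 with remove_mean=True, and 100 with remove_mean=False.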

    If resizing is the only issue, that should be it. If, however, the images may have been flipped, rotated, or cropped, additional steps might be necessary.
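    One possible extra step for the flipped case is to score the example against a few mirrored variants of the candidate and keep the best match. This is only a sketch under assumptions: the helper name best_flip_match is hypothetical, it uses plain MSE, and it covers only flips and a 180-degree rotation (which keep the array shape). Cropping and 90-degree rotations would need more work.

```python
import numpy as np


def best_flip_match(a, b):
    """Return (variant name, MSE) for the variant of b closest to a."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)

    # All of these keep the original (height, width) shape.
    variants = {
        "original": b,
        "flipped left-right": np.fliplr(b),
        "flipped up-down": np.flipud(b),
        "rotated 180": np.flipud(np.fliplr(b)),
    }

    # Lower MSE means a closer match.
    scores = {name: float(np.mean((a - v) ** 2)) for name, v in variants.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]
```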