Tags: python, python-imaging-library

How can I scale an image with PIL without ruining its appearance?


I noticed that simply multiplying and dividing an array, even by coefficients equivalent to 1, deforms the image.

I need to rescale the image pixels because I need to feed them to an ML model, but there seems to be a huge loss of information in the process.

This is the original image (an example):

Image.fromarray((np.array(out_img.resize((224, 224)))),'L')

Original image

If I divide it by 255, it somehow ends up like this:

Image.fromarray((np.array(out_img.resize((224, 224)))/255),'L')

Image after dividing by 255

A lot of information seems lost, and apparently I can't revert to the original:

(np.array(out_img.resize((224, 224)))/255*255==np.array(out_img.resize((224, 224)))).all()    
Image.fromarray((np.array(out_img.resize((224, 224)))/255*255),'L')

As you can see, I checked that dividing and multiplying by 255 gives back the same array, yet the images look different.

The same happens even if I naively divide and multiply by 1:

Image.fromarray((np.array(out_img.resize((224, 224)))*(1/1)),'L')

Image after multiplying by 1/1

Is there an explanation for this behaviour or a way to prevent the information loss?


Solution

  • You can't create an 'L' mode PIL image from a float array. PIL.Image.fromarray just plugs the data from the passed array into an image. With mode 'L', those data are supposed to be bytes (uint8), but what you passed are floats.

    See the following example:

    from PIL import Image
    import numpy as np
    img = np.array([[3.141592653589793, 12, 13, 14], [15, 16, 17, 18]])
    limg = np.array(Image.fromarray(img, 'L'))
    

    limg now has the same shape as img, since the PIL image built from img has that resolution. But its data are only 8 bytes (there are 8 pixels, and we said the format is 'L'): the first 8 bytes taken from img.

    See:

    img.tobytes()
    # b'\x18-DT\xfb!\t@\x00\x00\x00\x00\x00\x00(@\x00\x00\x00\x00\x00\x00*@\x00\x00\x00\x00\x00\x00,@\x00\x00\x00\x00\x00\x00.@\x00\x00\x00\x00\x00\x000@\x00\x00\x00\x00\x00\x001@\x00\x00\x00\x00\x00\x002@'
    limg.tobytes()
    # b'\x18-DT\xfb!\t@'
    

    We can even try to decode that

    import struct
    limg
    # array([[ 24,  45,  68,  84],
    #       [251,  33,   9,  64]], dtype=uint8)
    struct.pack('BBBBBBBB', 24, 45, 68, 84, 251, 33, 9, 64)
    # b'\x18-DT\xfb!\t@'
    # See, it is the same thing: just the bytes of limg, i.e. the first 8 bytes of img, shown as uint8
    
    struct.unpack('d', struct.pack('BBBBBBBB', 24, 45, 68, 84, 251, 33, 9, 64))
    # (3.141592653589793,)
    # See, it is just the 8 bytes of the float representation of the first float in img (the other 7 floats are lost)
    

    So, the images you get here are images of the bytes of the float data (of the first eighth of the float data, since there is no room for more). Each group of 8 pixels is the 8 bytes of one float.
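
    We can double-check this byte accounting with plain NumPy (same made-up 2×4 float array as above; the printed byte values assume a little-endian machine):

    ```python
    import struct

    import numpy as np

    # Same 2x4 float array as above: room for only 8 'L'-mode pixels
    floats = np.array([[3.141592653589793, 12.0, 13.0, 14.0],
                       [15.0, 16.0, 17.0, 18.0]])
    raw = floats.tobytes()
    print(len(raw))  # 64 bytes: 8 float64 values x 8 bytes each

    # An 8-pixel 'L' image only holds the first 8 of those 64 bytes,
    # i.e. exactly the byte representation of the first float
    print(np.frombuffer(raw[:8], dtype=np.uint8))  # [ 24  45  68  84 251  33   9  64] on little-endian
    print(struct.unpack('d', raw[:8])[0])  # 3.141592653589793
    ```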

    The same occurs for any operation that turns the uint8 ndarray into a float ndarray, including multiplying by (1/1).
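
    A quick check with a small made-up array shows the silent dtype promotion:

    ```python
    import numpy as np

    arr = np.array([[10, 20], [30, 40]], dtype=np.uint8)
    print(arr.dtype)                # uint8
    print((arr * (1 / 1)).dtype)    # float64: multiplying by a Python float promotes
    print((arr / 255 * 255).dtype)  # float64: true division always yields floats
    # The values round-trip (up to float rounding), but the dtype -- and
    # therefore the raw bytes PIL sees -- has changed completely
    print(np.allclose(arr / 255 * 255, arr))  # True
    ```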

    Solution

    I don't know which ML model you use, but I doubt it requires PIL images. So you could pass it the ndarray directly, including float arrays if needed.

    If you really need to use a PIL image, then you could use mode 'F' (32-bit floating point) instead of 'L' (which means 8-bit grayscale).
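
    For instance, a minimal sketch with a made-up 8×8 array (your model's input pipeline may differ):

    ```python
    from PIL import Image
    import numpy as np

    arr = np.arange(64, dtype=np.uint8).reshape(8, 8)  # stand-in grayscale image
    scaled = arr / 255.0                               # float64 in [0, 1]

    # Option 1: most ML frameworks accept the ndarray `scaled` directly;
    # no PIL image is needed at all.

    # Option 2: if you do want an 'L' image again, cast back to uint8 first
    # (np.round avoids off-by-one errors from float truncation)
    l_img = Image.fromarray(np.round(scaled * 255).astype(np.uint8))
    print(l_img.mode)                            # 'L'
    print(np.array_equal(np.array(l_img), arr))  # True

    # Option 3: keep the floats and build a 32-bit float ('F') image
    f_img = Image.fromarray(scaled.astype(np.float32))
    print(f_img.mode)                            # 'F'
    ```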

    Note that if you just hadn't passed the 'L' argument to fromarray, it would have guessed the mode by itself: grayscale because of the H×W shape (not H×W×3, which would be RGB, or H×W×2, which would be LA, ...), and 'F' because of the float dtype.
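
    That guessing is easy to verify with arbitrary all-zero arrays:

    ```python
    from PIL import Image
    import numpy as np

    h, w = 4, 4
    print(Image.fromarray(np.zeros((h, w), dtype=np.uint8)).mode)     # 'L'
    print(Image.fromarray(np.zeros((h, w), dtype=np.float32)).mode)   # 'F'
    print(Image.fromarray(np.zeros((h, w, 3), dtype=np.uint8)).mode)  # 'RGB'
    print(Image.fromarray(np.zeros((h, w, 2), dtype=np.uint8)).mode)  # 'LA'
    ```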

    Also note that your question has nothing to do with scaling. You would have had the exact same problem without any resize. Image.fromarray(np.array(img)*1.0, 'L') would have the same problem. This is not a scaling quality problem. It is an image format, even a data format, problem; you are using memory that contains floats and ask PIL to interpret it as if it were containing uint8.