Tags: python, numpy, image-processing, python-imaging-library, antialiasing

Remove anti-aliasing from an image


I want to remove the antialiasing from an image. The code below finds the four most frequent colors in the image, compares each pixel to those four colors, and assigns the closest one.

import numpy as np
from PIL import Image

image = Image.open('pattern_2.png')
image_nd = np.array(image)
image_colors = {}

for row in image_nd:
    for pxl in row:
        pxl = tuple(pxl)
        if not image_colors.get(pxl):
            image_colors[pxl] = 1
        else:
            image_colors[pxl] += 1

sorted_image_colors = sorted(image_colors, key=image_colors.get, reverse=True)
four_major_colors = sorted_image_colors[:4]


def closest(colors, color):
    colors = np.array(colors)
    color = np.array(color)
    distances = np.sqrt(np.sum((colors - color) ** 2, axis=1))
    index_of_smallest = np.where(distances == np.amin(distances))
    smallest_distance = colors[index_of_smallest]
    return smallest_distance[0]


for y, row in enumerate(image_nd):
    for x, pxl in enumerate(row):
        image_nd[y, x] = closest(four_major_colors, image_nd[y, x])

aliased = Image.fromarray(image_nd)
aliased.save("pattern_2_al.png")

This is the result:
[image: output of the code above]

As you can see, the borders between colors aren't perfect.

And this is the result I'm after:

[image: desired result]

(It seems the image hosting site compresses the image and won't show the "aliased" image properly.)


Solution

  • The main problem here is located in your closest method:

    def closest(colors, color):
        colors = np.array(colors)
        color = np.array(color)
        distances = np.sqrt(np.sum((colors - color) ** 2, axis=1))
    

    Both colors and color become NumPy arrays of type uint8. Subtracting uint8 values never yields a negative number: the result wraps around modulo 256, so a difference of -10 is stored as 246. The subsequent squaring is also carried out in uint8 and overflows in the same way. The calculated distances are therefore wrong, which leads to the wrong color being picked.
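
To make the wraparound concrete, here is a minimal sketch. The two pixel values are made up for illustration and are not taken from the image in the question; it only shows how uint8 arithmetic differs from the same arithmetic in a signed type.

```python
import numpy as np

# Two hypothetical RGB values, chosen only to trigger the wraparound
a = np.array([10, 200, 30], dtype=np.uint8)
b = np.array([20, 100, 30], dtype=np.uint8)

wrapped = a - b                                    # stays uint8, wraps modulo 256
signed = a.astype(np.int32) - b.astype(np.int32)   # true differences

print(wrapped.tolist())         # [246, 100, 0]
print(signed.tolist())          # [-10, 100, 0]

# Squaring a uint8 array also stays in uint8, so 100 ** 2 overflows
print((wrapped ** 2).tolist())  # [100, 16, 0]
print((signed ** 2).tolist())   # [100, 10000, 0]
```

The second channel shows why the distances come out wrong: the squared difference should be 10000 but is stored as 16.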

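    So, the quickest fix is to cast both variables to int32:

    def closest(colors, color):
        # Signed 32-bit integers: differences can be negative, squares cannot overflow
        colors = np.array(colors).astype(np.int32)
        color = np.array(color).astype(np.int32)
        distances = np.sqrt(np.sum((colors - color) ** 2, axis=1))
        # Return the palette color with the smallest distance
        return colors[np.argmin(distances)]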
    

    Also, it might be useful to make use of NumPy's vectorization power. Consider the following approach for your closest method:

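    def closest(colors, image):
        colors = np.array(colors).astype(np.int32)
        image = image.astype(np.int32)
        # One distance map per palette color, stacked to shape (n_colors, height, width)
        distances = np.array([np.sqrt(np.sum((color - image) ** 2, axis=2)) for color in colors])
        # Index of the nearest palette color for every pixel
        nearest = np.argmin(distances, axis=0)
        return colors[nearest].astype(np.uint8)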
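
As a side note, the list comprehension over the palette can also be written with broadcasting alone. The following is only a sketch of that idea, not the function used in this answer: the name closest_broadcast, the tiny 2x2 test image and the two-color palette are all made up for illustration. It skips the square root, which does not change which color is nearest.

```python
import numpy as np

def closest_broadcast(colors, image):
    # Signed types so the differences cannot wrap around
    colors = np.asarray(colors, dtype=np.int32)   # shape (n_colors, channels)
    image = np.asarray(image, dtype=np.int32)     # shape (height, width, channels)
    # Broadcast to (height, width, n_colors, channels)
    diff = image[:, :, np.newaxis, :] - colors[np.newaxis, np.newaxis, :, :]
    # Squared distances, shape (height, width, n_colors)
    squared = np.sum(diff ** 2, axis=-1)
    # Palette index of the nearest color for every pixel
    nearest = np.argmin(squared, axis=-1)
    return colors[nearest].astype(np.uint8)

# Made-up 2x2 RGB image and a black/white palette
palette = [(0, 0, 0), (255, 255, 255)]
image = np.array([[[10, 10, 10], [250, 240, 245]],
                  [[130, 130, 130], [120, 120, 120]]], dtype=np.uint8)

result = closest_broadcast(palette, image)
print(result[:, :, 0].tolist())  # [[0, 255], [255, 0]]
```

The intermediate array holds height x width x n_colors x channels int32 values, so for very large images or palettes the loop over colors may be the more memory-friendly choice.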
    

    So, instead of iterating all the pixels with

    for y in np.arange(image_nd.shape[0]):
        for x in np.arange(image_nd.shape[1]):
            image_nd[y, x] = closest(four_major_colors, image_nd[y, x])
    

    you can simply pass the whole image:

    image_nd = closest(four_major_colors, image_nd)
    

    Using the given image, I get a speed-up of 100x on my machine. Surely, finding the RGB histogram values can also be optimized. (Unfortunately, my experience with Python dictionaries isn't yet that great...)
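
One possible way to speed up the color counting is np.unique with axis=0 and return_counts=True, which avoids the Python-level loop over pixels. This is a sketch under assumptions: the helper name major_colors and the small demo array are hypothetical, and the image is assumed to have a channel axis (RGB or RGBA, not grayscale or palette mode). For colors with equal counts, the order may differ from the dictionary-based version.

```python
import numpy as np

def major_colors(image_nd, n=4):
    # Flatten to a 2-D array with one row per pixel
    pixels = image_nd.reshape(-1, image_nd.shape[-1])
    # Unique pixel values and how often each one occurs
    colors, counts = np.unique(pixels, axis=0, return_counts=True)
    # Indices of the n largest counts, most frequent first
    order = np.argsort(counts)[::-1][:n]
    return colors[order]

# Made-up 2x3 RGB image: three red, two blue and one green pixel
demo = np.array([[[255, 0, 0], [255, 0, 0], [0, 0, 255]],
                 [[255, 0, 0], [0, 0, 255], [0, 255, 0]]], dtype=np.uint8)

print(major_colors(demo, n=2).tolist())  # [[255, 0, 0], [0, 0, 255]]
```

With the variables from the question, this would be called as four_major_colors = major_colors(image_nd).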

    Anyway – hope that helps!