
Replacing RGB values in numpy array by integer is extremely slow


I want to replace the RGB values in a numpy array with single-integer representations. My code works, but it is too slow because I iterate over every element. Can I speed this up? I am new to numpy.

import numpy as np
from skimage import io

# dictionary of color codes for my rgb values
_color_codes = {
    (255, 200, 100): 1,
    (223, 219, 212): 2,
    ...
}

# get the corresponding color code for the rgb vector supplied
def replace_rgb_val(rgb_v):
    rgb_triple = (rgb_v[0], rgb_v[1], rgb_v[2])
    if rgb_triple in _color_codes:
        return _color_codes[rgb_triple]
    else:
        return -1

# function to replace, this is where I iterate
def img_array_to_single_val(arr):
    return np.array([[replace_rgb_val(arr[i][j]) for j in range(arr.shape[1])] for i in range(arr.shape[0])])


# my images are square so the shape of the array is (n,n,3)
# I want to change the arrays to (n,n,1)
img_arr = io.imread(filename)
# this takes from ~5-10 seconds, too slow!
result = img_array_to_single_val(img_arr)

Solution

  • Do the replacement the other way round: loop over the color codes instead of over the pixels. For each RGB triple, find all matching pixels at once and set the corresponding index in a new array:

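    import numpy

    def img_array_to_single_val(arr, color_codes):
        # start with -1 everywhere, the code for "no matching color"
        result = numpy.full(arr.shape[:2], -1, dtype=int)
        for rgb, idx in color_codes.items():
            # (n, n) boolean mask of the pixels equal to this RGB triple
            result[(arr == rgb).all(2)] = idx
        return result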
    

    Let's take the color-index assignment apart. First, arr == rgb compares every pixel's RGB values with the triple rgb, which gives an n x n x 3 boolean array. A pixel matches only if all three color components are equal, so .all(2) reduces the last axis and leaves an n x n boolean array that is True for every pixel matching rgb. The last step uses this mask to set the index for those pixels.
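
To make the mask step concrete, here is a minimal sketch on a made-up 2 x 2 image. The pixel values and the color code 1 are assumptions chosen for illustration, not data from the question.

```python
import numpy as np

# Hypothetical 2 x 2 RGB image, invented for this illustration.
arr = np.array([
    [[255, 200, 100], [223, 219, 212]],
    [[255, 200, 0], [255, 200, 100]],
], dtype=np.uint8)
rgb = (255, 200, 100)

# Broadcasting compares the triple against every pixel, channel by channel.
per_channel = arr == rgb            # shape (2, 2, 3)

# A pixel matches only if all three channels match.
mask = per_channel.all(axis=2)      # shape (2, 2)

# Start from -1 ("no match") and write the code where the mask is True.
result = np.full(arr.shape[:2], -1, dtype=int)
result[mask] = 1

print(mask)
print(result)
```

Note that the pixel (255, 200, 0) agrees with rgb in two channels but is still rejected, which is exactly what .all(2) is for.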

    It might be even faster to first pack each RGB triple into a single int32 value and then do the index translation:

    def img_array_to_single_val(image, color_codes):
        # pack each pixel into one integer: r * 65536 + g * 256 + b
        image = image.dot(numpy.array([65536, 256, 1], dtype='int32'))
        result = numpy.full(image.shape, -1, dtype=int)
        for rgb, idx in color_codes.items():
            rgb = rgb[0] * 65536 + rgb[1] * 256 + rgb[2]
            # one integer comparison per pixel instead of three
            result[image == rgb] = idx
        return result
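
As a quick check of the packing step, the sketch below applies the same weights to a made-up 1 x 2 image. The pixel values are assumptions for illustration only.

```python
import numpy as np

weights = np.array([65536, 256, 1], dtype=np.int32)

# Hypothetical 1 x 2 image: one pixel from the question's palette, one near-black pixel.
pixels = np.array([[[255, 200, 100], [0, 0, 1]]], dtype=np.uint8)

# The dot product over the last axis yields r * 65536 + g * 256 + b per pixel.
# Because the weights are int32, the result is int32 and cannot overflow uint8.
packed = pixels.dot(weights)

print(packed.shape, packed.dtype)
print(packed)
```

For the first pixel this gives 255 * 65536 + 200 * 256 + 100 = 16762980. The largest possible packed value is 256^3 - 1 = 16777215, which fits comfortably in an int32.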
    

    For very large images, or for many images, build a direct lookup table once, with one entry for each of the 256^3 possible colors (about 64 MiB as int32):

    # one entry per possible 24-bit color, -1 meaning "no matching color"
    color_map = numpy.full(256 * 256 * 256, -1, dtype='int32')
    for rgb, idx in color_codes.items():
        rgb = rgb[0] * 65536 + rgb[1] * 256 + rgb[2]
        color_map[rgb] = idx
    
    def img_array_to_single_val(image, color_map):
        # pack each pixel into one integer, then look all pixels up in one indexing step
        image = image.dot(numpy.array([65536, 256, 1], dtype='int32'))
        return color_map[image]
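
The following sketch shows one way the lookup table might be used end to end and checks it against a per-pixel dictionary lookup. The two-entry palette reuses the two colors visible in the question (the rest of the dictionary is elided there), and the random 64 x 64 test image is an assumption made up for the check.

```python
import numpy as np

# Only the two entries shown in the question; the real dictionary is longer.
color_codes = {(255, 200, 100): 1, (223, 219, 212): 2}

# Build the table once: one slot per 24-bit color, -1 for unknown colors.
color_map = np.full(256 ** 3, -1, dtype=np.int32)
for (r, g, b), idx in color_codes.items():
    color_map[r * 65536 + g * 256 + b] = idx

def img_array_to_single_val(image, color_map):
    # Pack each pixel into one integer and index the table with the whole array.
    packed = image.dot(np.array([65536, 256, 1], dtype=np.int32))
    return color_map[packed]

# Hypothetical test image: random mix of the two known colors and black (unknown).
rng = np.random.default_rng(0)
palette = np.array(list(color_codes) + [(0, 0, 0)], dtype=np.uint8)
image = palette[rng.integers(0, len(palette), size=(64, 64))]  # shape (64, 64, 3)

fast = img_array_to_single_val(image, color_map)

# Reference result from a plain per-pixel dictionary lookup.
slow = np.array([
    [color_codes.get(tuple(int(c) for c in px), -1) for px in row]
    for row in image
])

print(fast.shape, np.array_equal(fast, slow))
```

The table costs memory up front, but after that each image needs only one dot product and one indexing operation, regardless of how many colors the dictionary contains.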