Tags: python, image-processing, colors, histogram

How to find dominant color RGB using histograms?


I am currently writing some test code to explore different ways of processing image data, and one of my main topics is color extraction.

I have written the following Python code which, given an image, extracts the corresponding histograms of its R, G and B values:

import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import numpy as np

# Reading original image, in full color
img = mpimg.imread('/content/brandlogos/Images/14 (48).jpg')

# Displaying the image
imgplot = plt.imshow(img)
plt.show()

# Tuples pairing each channel index with its line color
colors = ("r", "g", "b")
channel_ids = (0, 1, 2)

# Making a histogram with three lines, one for each color channel
# The X axis spans 0 to 256, covering the 8-bit values 0..255
plt.xlim([0,256])
for channel_id, c in zip(channel_ids, colors):
  histogram, bin_edges = np.histogram(
      img[:, :, channel_id], bins = 256, range=(0,256)
  )
  plt.plot(bin_edges[0:-1], histogram, color=c )

# Displaying the histogram
plt.xlabel("Color Value")
plt.ylabel("Pixels")

plt.show()

Example:

[Image: Code Result]

However, I now want to find the RGB value of the most dominant color, according to the histogram information.

Expected output: (87, 74, 163) (or something similar)

How would I go about finding the highest bin counts for the three color channels, combining them into a single color?


Solution

  • You can focus on the unique colors in your image, and find the one with the highest count:

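    import matplotlib.image as mpimg
    import numpy as np
    
    # mpimg.imread returns PNG data as floats in [0, 1], so scale to 0..255
    # (other formats such as JPEG already load as uint8 and need no scaling)
    img = np.uint8(mpimg.imread('path/to/your/image.png') * 255)
    # Flatten to a list of pixels, one RGB row per pixel
    img = np.reshape(img, (np.prod(img.shape[:2]), 3))
    # Unique colors and how often each one occurs
    res = np.unique(img, axis=0, return_counts=True)
    # Pick the color with the highest count
    dom_color = res[0][np.argmax(res[1]), :]
    print(dom_color)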
    

    Alternatively, you can use Pillow's getcolors, and get the one with the highest count:

    from PIL import Image
    
    img = Image.open('path/to/your/image.png')
    # getcolors returns (count, color) pairs; 2 ** 24 covers every possible RGB color
    dom_color = sorted(img.getcolors(2 ** 24), reverse=True)[0][1]
    print(dom_color)
    

    Both approaches can also easily provide the second, third, ... dominant colors, if needed.
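
As a minimal sketch of how the runner-up colors could be read off the np.unique result: sort the counts in descending order and keep the first k rows. The helper name top_colors and the tiny synthetic image are illustrative assumptions, not part of the original answer.

```python
import numpy as np

def top_colors(pixels, k=3):
    """Return the k most frequent RGB colors of an (H, W, 3) array and their counts."""
    flat = pixels.reshape(-1, 3)                      # one RGB row per pixel
    colors, counts = np.unique(flat, axis=0, return_counts=True)
    order = np.argsort(counts)[::-1][:k]              # indices of the k largest counts
    return colors[order], counts[order]

# Synthetic 4x4 image: 10 red pixels, 4 green pixels, 2 blue pixels
demo = np.zeros((4, 4, 3), np.uint8)
demo[...] = [255, 0, 0]
demo[0, :] = [0, 255, 0]
demo[1, :2] = [0, 0, 255]

colors, counts = top_colors(demo, k=2)
print(colors.tolist(), counts.tolist())
# [[255, 0, 0], [0, 255, 0]] [10, 4]
```

With Pillow, the equivalent would be to take the first k entries of the sorted getcolors list instead of only the first one.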

    For a logo like this

    [Image: Logo]

    the first approach gives the dominant color as:

    [208  16  18]
    

    and the second as:

    (208, 16, 18)
    
    ----------------------------------------
    System information
    ----------------------------------------
    Platform:      Windows-10-10.0.16299-SP0
    Python:        3.9.1
    PyCharm:       2021.1.1
    Matplotlib:    3.4.1
    NumPy:         1.20.2
    Pillow:        8.2.0
    ----------------------------------------
    

    EDIT: To give an adversarial example for the histogram approach, as Cris also explained in the comments, consider this image:

    [Image: Adversary]

    It's hard to see, but there are three different shades of red; the largest rectangle has RGB values of (208, 0, 18).

    Now, let's compare your histogram approach with Pillow's getcolors:

    import numpy as np
    from PIL import Image
    
    adv = np.zeros((512, 512, 3), np.uint8)
    adv[:200, :200, :] = [208, 16, 18]
    adv[:200, 200:, :] = [208, 16, 0]
    adv[200:, :200, :] = [0, 16, 18]
    adv[200:, 200:, :] = [208, 0, 18]
    
    # Histogram approach
    dom_color = []
    for channel_id in [0, 1, 2]:
        dom_color.append(np.argmax(np.histogram(adv[..., channel_id], bins=256, range=(0, 256))[0]))
    print(dom_color)
    # [208, 16, 18]
    
    # Pillow's getcolors
    dom_color = sorted(Image.fromarray(adv).getcolors(2 ** 24), reverse=True)[0][1]
    print(dom_color)
    # (208, 0, 18)
    

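    The adversarial image is built to give per-channel histogram peaks at (208, 16, 18). Nevertheless, the dominant color is (208, 0, 18), as we can easily derive from the image (or the code). The per-channel histograms count each channel independently, so they lose the information about which R, G and B values actually occur together in a pixel.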
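
If a histogram-based answer is still wanted, one option is a joint histogram over all three channels at once, which keeps the R, G and B values of each pixel together. The following is only a sketch under stated assumptions: the choice of 32 bins per channel is arbitrary, and averaging the pixels in the fullest bin is one possible way to turn the winning bin back into a single color. It is not part of the original answer.

```python
import numpy as np

# Same adversarial image as above
adv = np.zeros((512, 512, 3), np.uint8)
adv[:200, :200, :] = [208, 16, 18]
adv[:200, 200:, :] = [208, 16, 0]
adv[200:, :200, :] = [0, 16, 18]
adv[200:, 200:, :] = [208, 0, 18]

bins = 32                 # assumed bin count per channel
width = 256 // bins       # each bin covers 8 consecutive values
flat = adv.reshape(-1, 3)

# Joint 3D histogram: one cell per (R bin, G bin, B bin) combination
hist, _ = np.histogramdd(flat.astype(float), bins=(bins,) * 3, range=[(0, 256)] * 3)
peak_bin = np.array(np.unravel_index(np.argmax(hist), hist.shape))

# Average the pixels that fall into the fullest cell to get one representative color
in_peak = np.all(flat // width == peak_bin, axis=1)
dom_color = tuple(int(v) for v in np.round(flat[in_peak].mean(axis=0)))
print(dom_color)
# (208, 0, 18)
```

Coarser bins merge similar shades into one cell, which can be useful for photographs where exact colors rarely repeat; finer bins approach the exact per-color count of the two approaches above.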