I'm trying to detect an object's color in an efficient way. Let's assume I run a YOLO model and crop the object region given the bounding boxes. Given the cropped object image, what's the most efficient and accurate way to detect the color of the object?
Previously, I trained a YOLO model to detect the color (10 classes of colors), but running 2 deep learning models is too slow for my real-time requirements. I need the color detection/classification part to be very fast, preferably not using deep learning. Maybe pure Python or OpenCV or whatnot.
I wrote this piece of code that resizes the image to a 1x1 pixel. I then visualize the color in a square. But it's not accurate at all, just way off.
from PIL import Image

def get_dominant_color(pil_img):
    img = pil_img.copy()
    img = img.convert("RGBA")
    # resample=0 is NEAREST, so this effectively samples a single pixel
    img = img.resize((1, 1), resample=0)
    dominant_color = img.getpixel((0, 0))
    return dominant_color
# Specify the path to your image
image_path = "path/to/your/image.jpg"
# Open the image using PIL
image = Image.open(image_path)
# Get the dominant color
dominant_color = get_dominant_color(image)
# Print the color in RGB format
print("Dominant Color (RGB):", dominant_color[:3])
# Create a new image with a 100x100 square of the dominant color
square_size = 100
square_image = Image.new("RGB", (square_size, square_size), dominant_color[:3])
# Display the square image
square_image.show()
Let me summarize the voluminous content of my answer below in a single sentence:
There is no coding solution to expectations based on wrong assumptions.
In other words, your question "How to detect car color efficiently?" rests on the false assumption that it is possible to detect car color without further segmentation of the image and a deep analysis of entire areas of the extracted segments, along with the relationships between those areas.
If you still want to try for a coding solution anyway, I suggest an "upside down" approach to what you want to achieve, consisting of the following steps:
As you arrive at tuning, you will probably notice why A10, in a comment on your question, speaks of a hard problem. Is a white car not a gray car in very cloudy weather, shortly before the sun goes down?
The color intuition you gain from analyzing all the image details and the image content goes far beyond a fast and simple color comparison: you see in the image whether the sun is shining or whether it is cloudy or dark, and you intuitively adjust your color criteria accordingly, for example by taking cast shadows and car window areas out of consideration. In other words, you will probably realize that there is no way around training a learning model on gigabytes of images, and no way around using faster hardware to achieve higher processing speed.
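A crude stand-in for "taking shadows and car window areas out of consideration" can at least be sketched: mask out nearly black pixels before averaging. The function name, brightness cutoff, and fallback below are illustrative assumptions, not tuned values:

```python
import numpy as np
from PIL import Image

def masked_mean_rgb(pil_img, min_value=40, min_count=50):
    """Mean RGB over pixels that are not nearly black.

    min_value is an assumed brightness cutoff (0-255) meant to drop
    shadows and dark window glass; min_count guards against crops
    where almost every pixel was masked out."""
    rgb = np.asarray(pil_img.convert("RGB"), dtype=float).reshape(-1, 3)
    brightness = rgb.max(axis=1)        # the V channel of HSV
    keep = brightness >= min_value
    if keep.sum() < min_count:          # too few usable pixels left
        return tuple(rgb.mean(axis=0))  # fall back to the plain mean
    return tuple(rgb[keep].mean(axis=0))
```

On a crop whose upper half is dark window glass, the masked mean stays close to the paint color, while a plain mean is pulled toward black. It is still only a pixel statistic, though, not the scene-level reasoning described above.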
To dig a bit into the reasons behind your unrealistic expectations, I suggest you read about the color silver, for example on Wikipedia. Here is an excerpt:
The visual sensation usually associated with the metal silver is its metallic shine. This cannot be reproduced by a simple solid color because the shiny effect is due to the material's brightness varying with the surface angle to the light source.
from which you can see that it is impossible to identify a silver car color, because silver is not an R,G,B color of an image pixel, but the result of the overall impression of the entire colored surface under the given lighting conditions, perceived in the context of all the details in the image.
Notice that, considering the above, the actual purpose of the suggested step of:
- define the exact rgb() values of all car colors you want to detect in the images. For example, as you suggested in a comment, the rgb() values of 12 colors: red, white, black, silver, gray, yellow, orange, blue, pink, brown, beige and green.
is to make you aware of the limitations of the proposed approach and of the fact that you won't be able to define silver, white and gray colors in a way that matches the car color as perceived by the human eye.
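To see that limitation concretely, here is a minimal nearest-color sketch. The 12 anchor values below are illustrative assumptions, not a tested palette; notice how close silver, gray, and white sit to each other, so a slightly brighter silver highlight already classifies as white:

```python
import numpy as np

# Illustrative anchor values only -- real car paints will not match these.
COLOR_ANCHORS = {
    "red":    (200, 30, 40),
    "white":  (245, 245, 245),
    "black":  (20, 20, 20),
    "silver": (192, 192, 192),
    "gray":   (128, 128, 128),
    "yellow": (230, 200, 30),
    "orange": (240, 130, 30),
    "blue":   (30, 60, 180),
    "pink":   (240, 150, 180),
    "brown":  (120, 75, 45),
    "beige":  (210, 190, 160),
    "green":  (40, 140, 60),
}

def nearest_color(rgb):
    """Return the anchor name with the smallest Euclidean RGB distance."""
    names = list(COLOR_ANCHORS)
    anchors = np.array([COLOR_ANCHORS[n] for n in names], dtype=float)
    dists = np.linalg.norm(anchors - np.asarray(rgb, dtype=float), axis=1)
    return names[int(np.argmin(dists))]

# A bright silver highlight lands closer to white than to silver:
print(nearest_color((230, 230, 230)))  # -> "white"
```

Saturated colors like red survive this scheme; it is exactly the achromatic silver/white/gray band where plain RGB distance falls apart.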
By the way, compared to the code you posted in your question, it could help to crop the images to get rid of the frames indicating the cars, then split each image into two halves, an upper and a lower one, and decide on a result by comparing the results for both halves. Splitting into upper and lower parts can help to eliminate the impact of large dark car windows and to detect inconsistencies in the color detection. Best would be a detection able to provide a precise contour, not only a rectangle; this would eliminate the noise from the road surface and the other surroundings.
Depending on the results of evaluating a huge number of images, you may decide to divide the images in the vertical direction into three parts instead of two to improve the results. Notice also that the effectiveness of dividing the image into strips depends on the camera perspective: for the white and yellow cars a simple vertical split will be sufficient, but for the silver and red cars vertical splitting combined with cropping a parallelogram will give better results. In other words, the camera perspective should be a parameter evaluated by the method determining the car color.
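The crop-and-split idea above can be sketched as follows. The function names, the relative crop margin, and the consistency threshold are all illustrative assumptions:

```python
import numpy as np
from PIL import Image

def mean_rgb(img):
    """Mean RGB over all pixels of a PIL image."""
    return tuple(np.asarray(img.convert("RGB"), dtype=float)
                 .reshape(-1, 3).mean(axis=0))

def halves_mean_color(img, margin=0.1):
    """Crop an assumed relative margin off each side (to drop the frame
    around the car), split into upper and lower halves, and return the
    mean color of each half."""
    w, h = img.size
    mx, my = int(w * margin), int(h * margin)
    cropped = img.crop((mx, my, w - mx, h - my))
    cw, ch = cropped.size
    upper = cropped.crop((0, 0, cw, ch // 2))
    lower = cropped.crop((0, ch // 2, cw, ch))
    return mean_rgb(upper), mean_rgb(lower)

def colors_consistent(c1, c2, threshold=60.0):
    """Illustrative consistency check on Euclidean RGB distance; a large
    distance between the halves signals e.g. dark windows dominating
    the upper half."""
    return float(np.linalg.norm(np.subtract(c1, c2))) <= threshold
```

When the two halves disagree, that is the signal to distrust the detection or to fall back to the lower half, which is less likely to contain window glass.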