Search code examples
algorithmsortingdata-structurescolorsgeometry

Closest RGB color from list of colors


I was in an interview and I was asked to create an efficient algorithm to the following problem:

Input:

We have a List of RGB colors. Every color represented by 3 coordinate <x,y,z> between 0 to 255. This list never change.

We are getting every time some additional color (not necessarily from the list above) and we need to return the closest (in term of distance) color from the list to the additional color.

Notes:

  1. We can do some pre-processing on the colors's list because the list never change.

  2. Distance between colors defined as: d = ((x2 - x1)^2 + (y2 - y1)^2 + (z2 - z1)^2)^1/2

Example:

Let: List: {<1,1,1>,<1,0,1>,<2,2,2>,<0,1,0>} Additional color: <0,0,0>

Result: The minimal distance to <0,0,0> is <0,1,0>.

My attempt to solve this:

  1. Obviously we can do pre-processing and to keep all the color pairs in the world and to save the distance as well and we can get the solution in O(1) run time but with huge memory (255^3*n)

  2. The most naive solution is to loop over the list and to calculate the distance between every color from the list to the additional color and to return the color with the minimum distance. It's taking O(n) where n is the length of the color list.

I tried to maybe to do some sort the list with x,y,z coordinate and keeping 3 sorted list or sort by distance to <0,0,0> but I don't have a clue how to continue with it.

I also saw this but the question is not about the algorithmic approach to the problem.


Solution

  • Precomputing a full lookup table is not so unthinkable nowadays. As long as you have less than 256 reference colors, the required array has 256³ (not 255³) entries, which is 17 MB. (The double for 65536 colors.) But you really need good reasons to use such a device.

    If the number of colors is reasonable, the naive solution is quite acceptable.

    For larger number of colors (say from 10 up), several practical solutions are possible:

    • a kD-tree (where k=3); query time close to O(Log N);

    • an octree; query time close to O(Log N) if the colors are not clustered;

    • a grid (for instance 4096 cells of size 16); query time O(N), but in lucky cases, the asymptotic constant can be made very small.

    The tools of computational geometry might suggest a 3D Voronoi diagram combined to a locator in a 3D subdivision, but these are sophisticated and pretty hard to implement, even though they will support a guaranteed O(Log N) query time.

    I would personally favor a kD-tree.