Tags: r, colors, mapping, cluster-analysis

Mapping distances to colors


Given a matrix of pairwise distances between a number of samples, I would like to map the samples to a color space in a reasonable way, so that similar samples get similar colors. For example, if there are three apparent clusters, each cluster should have its own color, and the samples within a cluster should appear as different shades of that color. However, I would like to avoid explicit clustering, if possible.

Clearly, the mapping cannot be perfect and universal: rather, it is a heuristic.

Is there a known algorithm for that? Or, perhaps, a ready solution for R?


Solution

  • Here is one possibility. No matter how many dimensions your original data had, you can apply multi-dimensional scaling (MDS) to the distance matrix to project the data to three dimensions in a way that approximately preserves distances. If you treat the three dimensions as R, G and B, this gives a color scheme in which points that are close to each other get similar colors.

    Here is a simple example. I generate some 5-dimensional data with 4 clusters (although no cluster analysis is performed) and compute the distance matrix from it. Then, as described above, multi-dimensional scaling turns the distances into a color map. Finally, the points are plotted (using two of the five original dimensions, x and y) to show the result.

    ## Generate some sample data
    set.seed(1234)
    v = c(rnorm(80,0,1), rnorm(80,0,1), rnorm(80,4,1), rnorm(80,4,1)) 
    w = c(rnorm(80,0,1), rnorm(80,4,1), rnorm(80,0,1), rnorm(80,4,1)) 
    x = c(rnorm(80,0,1), rnorm(80,0,1), rnorm(80,4,1), rnorm(80,4,1)) 
    y = c(rnorm(80,0,1), rnorm(80,4,1), rnorm(80,0,1), rnorm(80,4,1)) 
    z = c(rnorm(80,0,1), rnorm(80,4,1), rnorm(80,-4,1), rnorm(80,8,1)) 
    df = data.frame(v,w,x,y,z)
    
    ## Distance matrix
    D = dist(df)
    
    ## Project to 3 dimensions with classical multi-dimensional scaling
    PROJ3 = cmdscale(D, 3)
    
    ## Scale the three dimensions to [0,1] interval
    ScaledP3 = apply(PROJ3, 2, function(x) { (x - min(x))/(max(x)-min(x)) })
    colnames(ScaledP3) = c("red", "green", "blue")
    X = as.data.frame(ScaledP3)

    ## Convert to color map: one RGB color per sample
    ColorMap = do.call(rgb, X)

    ## Plot two of the original dimensions, colored by the map
    plot(x, y, pch=20, col=ColorMap)
    

    [Figure: points colored by distance]
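
If you want to reuse the idea on other distance matrices, the steps can be wrapped in a small function. The following is only a sketch: the name `dist_to_colors` and the synthetic two-group data are made up for illustration and are not part of base R. It also shows one rough way to check how well the colors reflect the original distances, by correlating the original distances with the distances between points in the rescaled RGB cube.

```r
## Hypothetical helper (illustration only): map a distance matrix to
## one RGB color per sample using classical MDS.
dist_to_colors <- function(D) {
  P <- cmdscale(D, k = 3)

  ## Rescale one MDS axis to [0, 1]; a constant axis is mapped to 0.5
  ## to avoid dividing by zero.
  rescale <- function(x) {
    r <- range(x)
    if (r[1] == r[2]) rep(0.5, length(x)) else (x - r[1]) / (r[2] - r[1])
  }

  S <- apply(P, 2, rescale)
  list(rgb = S, colors = rgb(S[, 1], S[, 2], S[, 3]))
}

## Synthetic data, assumed for illustration: two groups in 4 dimensions
set.seed(42)
m <- rbind(matrix(rnorm(40 * 4, 0, 1), ncol = 4),
           matrix(rnorm(40 * 4, 5, 1), ncol = 4))
D <- dist(m)
res <- dist_to_colors(D)

## Rank correlation between original distances and color distances.
## Values near 1 mean the ordering of distances is largely preserved.
rho <- cor(as.vector(D), as.vector(dist(res$rgb)), method = "spearman")
print(rho)
```

Note that rescaling each axis to [0, 1] independently stretches the less important MDS axes relative to the first one, so the colors exaggerate minor variation somewhat. If that matters for your data, dividing all three axes by a common range instead is a possible alternative, at the cost of less saturated colors.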