Suppose we have a 2D array of gray-scale pixels. As we iterate through a large number of images, we find that each pixel has some level of correlation with all the others - across all images, certain subsets of pixels tend to be "on" at the same time as each other. How would I algorithmically rearrange the pixels' locations so that pixels which correlate with each other are also located near each other in the image?
Below is a visual (though maybe not technically accurate) depiction of what I want to do. Imagine that similarly-colored pixels are correlated across all images. We want to rearrange the pixels' locations so that these correlated pixels are co-located on the grid:
I have tried a genetic algorithm; the fitness function considered both euclidean distance between two pixels and their correlation. However it's too slow for my application, and I feel like their should be a more elegant approach.
Any thoughts would be appreciated.
Here is my formulation of your problem: You have a 2-dimensional array of vectors with correlations between these vectors. You want to rearrange the array so that vectors which are highly-correlated with each other are close together. The problem, presumably, is that any objective function which involves iterating over the entire array is too expensive to use with something like a genetic algorithm.
Here is one idea: Decide on a notion of the neighborhood of a cell (position in the array). I would go discrete rather than Euclidean since there is less computational overhead. Maybe the neighborhood of a cell is just itself and its immediate neighbors -- but you could go farther. The important thing is to keep the neighborhood small compared to the overall array. For each neighborhood -- calculate something like the average correlation between vectors in that neighborhood. Think of this as a local objective function. The overall objective function could be obtained by summing or averaging the values of these smaller objective functions.
Use some sort of hill-climbing approach (or maybe simulated annealing or tabu search) which involves making local changes on a single array rather than maintaining an entire population of arrays. Use local changes which consist of swapping two entries (and also experiment with such things as permuting triples of elements). A key insight is that such a local change only involves changing a handful of these local objective functions. In particular -- you can reject a move as being non-improving without having to recompute the overall objective function at all. Furthermore -- once you accept a move as improving, you can update the overall objective function without having to recompute very much.
I am vaguely uncomfortable with the idea of using correlation as the measure of similarity since correlation is signed. If you can't get good results with your present approach, perhaps rather than maximizing correlation you can try to minimize the squared distance between neighboring vectors.