
Applying normalization to RGB images and getting RGB images as output


My question is short and rather naive, but I found very different answers on the Internet: what is the most effective way of normalizing an RGB image, and which one is actually used in the computer vision community?
The question arises because algorithms such as PCA, or even contrast normalization, are often described in their 2D versions. So for whitening, global contrast normalization, or whatever method you use to preprocess images before feeding them to a statistical method: do you treat each channel separately, or do you reshape the depth-3 array into a rectangular 2D array (of depth 1), apply the method, and then reshape it back to its original form? And if so, how do you reshape while preserving the structure?

I think each approach has its advantages: treating the image as a whole seems more meaningful, but applying the method to each channel separately is simpler.
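To make the two options concrete, here is a minimal NumPy sketch. The function names `normalize_global` and `normalize_per_channel`, the `(H, W, 3)` layout, and the small constant added to the standard deviation are assumptions for illustration, not an established API. It also shows that flattening and reshaping back is lossless as long as the original shape is kept.

```python
import numpy as np

def normalize_global(img):
    """Standardize with a single mean and std over all pixels and channels."""
    img = img.astype(np.float64)
    return (img - img.mean()) / (img.std() + 1e-8)

def normalize_per_channel(img):
    """Standardize each channel with its own mean and std; img is (H, W, 3)."""
    img = img.astype(np.float64)
    mean = img.mean(axis=(0, 1), keepdims=True)  # shape (1, 1, 3)
    std = img.std(axis=(0, 1), keepdims=True)    # shape (1, 1, 3)
    return (img - mean) / (std + 1e-8)

# A small random RGB image used only as example data.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(4, 5, 3)).astype(np.uint8)

# Flatten to one vector and restore: reshape keeps the element order,
# so the spatial and channel structure is recovered exactly.
flat = img.reshape(-1)              # shape (H * W * 3,)
restored = flat.reshape(img.shape)  # back to (H, W, 3)

global_norm = normalize_global(img)
channel_norm = normalize_per_channel(img)
```

Both results are floating-point arrays with the same shape as the input; to view them as ordinary RGB images they would still need to be rescaled to a displayable range.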


Solution

  • I will make my answer specific to ZCA whitening, but I guess the same applies to other methods:
    The input to PCA is a 2D matrix of shape (nsamples x nfeatures). My first idea was to use the RGB channels as the samples and the flattened image in each channel as the features.
    The answer seems to be to use the images as the samples (nsamples is the number of RGB images you have) and each completely flattened RGB image as the features.
    This leads me to believe that, if you want to normalize an image, you should use the overall mean and overall standard deviation of the image rather than treating each channel separately. If anybody disagrees, feel free to comment; I agree that my question was a bit too broad.
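The layout described in this answer can be sketched as follows. This is only an illustration under stated assumptions: `zca_whiten` is a hypothetical helper name, the images are stored as `(n_samples, H, W, 3)`, and `eps` is a small regularizer chosen for the example. Each RGB image is flattened into one row of `H * W * 3` features, the whitening is computed across samples, and the result is reshaped back to the original image shape.

```python
import numpy as np

def zca_whiten(images, eps=1e-5):
    """ZCA-whiten a batch of RGB images of shape (n_samples, H, W, 3)."""
    n = images.shape[0]
    # One row per image, one column per (pixel, channel) feature.
    X = images.reshape(n, -1).astype(np.float64)
    # Center each feature across the samples.
    Xc = X - X.mean(axis=0)
    # Feature covariance matrix, shape (n_features, n_features).
    cov = Xc.T @ Xc / n
    # Eigendecomposition of the symmetric covariance matrix.
    eigvals, eigvecs = np.linalg.eigh(cov)
    eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negative values
    # ZCA matrix: rotate, rescale, rotate back.
    W = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    # Whiten and restore the original image shape.
    return (Xc @ W).reshape(images.shape)

# Example data: 200 tiny random RGB images with values in [0, 1).
rng = np.random.default_rng(0)
images = rng.random((200, 2, 2, 3))
whitened = zca_whiten(images)
```

Because the covariance matrix has `(H * W * 3)` rows and columns, this direct approach is only practical for small images or patches, and it needs more samples than features for the covariance to be well conditioned.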