opencv svm normalization feature-extraction scikit-image

Using cv2.COLOR_BGR2GRAY or color.rgb2gray for the HOG descriptor from skimage?


I want to train an SVM with HOG features extracted by the HOG descriptor from skimage. The images have 3 channels (RGB images), which I want to convert to grayscale before extracting the HOG features. Here is the problem: when I use the following code from OpenCV

img_gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

then I get an image that is not normalized, meaning the pixel values are still between 0 and 255.

When I use the code from skimage

img_gray = color.rgb2gray(image)

then the values seem to be normalized, because the pixel values are approximately between 0 and 1.

I tried both versions for HOG feature extraction, and the resulting features are similar but not identical. The same holds for the SVMs trained on them: the results are similar but not the same. When I train the SVM on the normalized images, the accuracy and other metrics are slightly better, but not by much.

Looking at the skimage example at https://scikit-image.org/docs/dev/auto_examples/features_detection/plot_hog.html, I assume that images do not need to be normalized before using the HOG descriptor, since the astronaut image in that example is not normalized either. Still, I find it confusing. Could you confirm or refute my assumption that it is better to use OpenCV's code than the code from skimage for converting from RGB to grayscale?

The full code:

import cv2
img_gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

or

from skimage import color
img_gray = color.rgb2gray(image)

before

from skimage import feature
feat = feature.hog(img_gray, orientations=12, pixels_per_cell=(5,5), cells_per_block=(2,2), transform_sqrt=True, visualize=False)

Solution

    1. Regarding the grayscale input: if you pass a grayscale image, converting it to float beforehand or leaving it as an unsigned integer will not have much effect. If you look into the skimage implementation of hog, you will notice that an unsigned-integer input is converted to a float image internally (the cast changes the dtype, not the value range):

       if image.dtype.kind == 'u':
           # convert uint image to float
           # to avoid problems with subtracting unsigned numbers
           image = image.astype('float')
      
    2. Regarding the normalization: it improves your HOG output because it acts similarly to a histogram equalization, effectively stretching or compressing the dynamic range so that the image content is enhanced. That is why you observe this behaviour, and it is expected.
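
To make the difference between the two grayscale inputs concrete, here is a minimal sketch that uses only NumPy. The helpers `gray_like_opencv` and `gray_like_skimage` are hypothetical stand-ins, not the library calls themselves. They assume the channel weights documented by each library (0.299, 0.587, 0.114 for OpenCV and 0.2125, 0.7154, 0.0721 for skimage) and an image stored in RGB channel order. The image is synthetic.

```python
import numpy as np

# Hypothetical stand-ins for the two library calls (assumption: they only
# reproduce the documented channel weights and output ranges).

def gray_like_opencv(rgb_uint8):
    """Mimic an OpenCV RGB-to-gray conversion: uint8 output in 0..255."""
    weights = np.array([0.299, 0.587, 0.114])
    return np.rint(rgb_uint8 @ weights).astype(np.uint8)

def gray_like_skimage(rgb_uint8):
    """Mimic skimage.color.rgb2gray: float output in 0..1."""
    weights = np.array([0.2125, 0.7154, 0.0721])
    return (rgb_uint8 / 255.0) @ weights

# Synthetic RGB image, used only for illustration.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

gray_cv = gray_like_opencv(image)
gray_sk = gray_like_skimage(image)

# The cast applied to unsigned input changes the dtype only;
# the values stay in the 0..255 range.
gray_cv_float = gray_cv.astype('float')

print(gray_cv.dtype, gray_cv_float.dtype, gray_sk.dtype)
print("uint8 path range:", gray_cv_float.min(), gray_cv_float.max())
print("float path range:", gray_sk.min(), gray_sk.max())

# Even after bringing both images to the same 0..1 scale, the pixel values
# differ, because the two conversions weight the channels differently.
max_diff = np.abs(gray_cv_float / 255.0 - gray_sk).max()
print("largest per-pixel difference on a 0..1 scale:", max_diff)
```

The two inputs therefore differ in scale and also slightly in content, which is consistent with HOG features that are similar but not identical. Note also that `cv2.COLOR_BGR2GRAY` assumes BGR channel order; if the array is actually in RGB order, `cv2.COLOR_RGB2GRAY` is the matching flag.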