Significance of sobel's scale when searching Harris corners


In the function cornerEigenValsVecs in corner.cpp, I am stuck on understanding the effect of the local variable scale that is passed to Sobel (lines 257 to 263):

int depth = src.depth();
double scale = (double)(1 << ((aperture_size > 0 ? aperture_size : 3) - 1)) * block_size;
if( aperture_size < 0 )
    scale *= 2.0;
if( depth == CV_8U )
    scale *= 255.0;
scale = 1.0/scale;

To my understanding, scale will be 1/(255*12) if src is of type CV_8UC1 (taking aperture_size = 3 and block_size = 3, so that 4 * 3 = 12). Applying 1/255 normalizes the pixel intensities to [0,1], but what about the additional factor of 1/12? What is its effect?


Solution

  • The 3x3 Sobel filter is the outer product of a derivative filter [-1 0 1] and a smoothing filter [1 2 1]. For a 5x5 aperture, each of these two 1-D filters is smoothed once more with [1 2 1], and again for 7x7. The "correct" normalization of these filters, i.e. the one that makes the smoothing weights add up to 1 and the derivative return the true slope, is 1/2 for the derivative and 1/4 for the smoothing. So a 3x3 filter should be normalized by 1/8, the 5x5 by 1/128 and the 7x7 by 1/2048. Calling r the aperture, the scaling should be 1/2^(2r-3). More details can be found here.

    The code for this "should" be:

    double scale = 1 << (2 * aperture_size - 3);
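
    The 1/8 figure for the 3x3 case can be checked numerically. The sketch below is plain C++ with no OpenCV dependency; the names sobelRampResponse, kDerivative and kSmoothing are made up for this illustration. It applies the unnormalized 3x3 x-derivative kernel to a ramp image I(x, y) = slope * x and returns the raw response:

    ```cpp
    // 1-D building blocks of the 3x3 Sobel kernel.
    static const int kDerivative[3] = {-1, 0, 1};
    static const int kSmoothing[3]  = {1, 2, 1};

    // Raw response of the unnormalized x-derivative Sobel kernel
    // (kSmoothing along y, kDerivative along x) on the ramp I(x, y) = slope * x.
    // Hypothetical helper for illustration only, not part of OpenCV.
    int sobelRampResponse(int slope) {
        int response = 0;
        for (int y = 0; y < 3; ++y) {
            for (int x = 0; x < 3; ++x) {
                int weight = kSmoothing[y] * kDerivative[x];  // outer product entry
                int pixel = slope * x;                        // ramp, constant along y
                response += weight * pixel;
            }
        }
        return response;
    }
    ```

    The raw response is 8 times the slope, so dividing by 8 recovers the true derivative.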
    

    For some reason I really cannot fathom, OpenCV uses the normalization 1/2^(r-1), which leads to:

    double scale = 1 << (aperture_size - 1);
    

    That is, the scaling is 1/4 for 3x3, 1/16 for 5x5, and 1/64 for 7x7.
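
    To see how far apart the two conventions are, here is a small sketch that evaluates both denominators for a given aperture. The function names are invented for this comparison and are not OpenCV API:

    ```cpp
    // Denominator of the unit-gain normalization, 2^(2r - 3), for aperture r.
    int unitGainDenominator(int aperture) {
        return 1 << (2 * aperture - 3);
    }

    // Denominator produced by the expression OpenCV uses, 2^(r - 1).
    int opencvDenominator(int aperture) {
        return 1 << (aperture - 1);
    }

    // Factor by which the OpenCV-scaled gradient exceeds the unit-gain one.
    int gainRatio(int aperture) {
        return unitGainDenominator(aperture) / opencvDenominator(aperture);
    }
    ```

    For r = 3, 5, 7 the ratio is 2, 8 and 32, i.e. 2^(r-2), so the gradients that come out with the 1/2^(r-1) scaling grow with the aperture instead of staying comparable.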

    The rest is easy to understand: if you want to use the Scharr filter, the aperture is set to CV_SCHARR, that is -1, so the conditional operator substitutes an aperture of 3. Strangely enough, a later if multiplies everything by a further 2, which could have been folded into the conditional operator by substituting 4 instead. So the normalization for the Scharr filter is 1/8. Again, I don't know why.
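
    Putting the pieces together, the snippet from the question can be wrapped in a standalone function to try out different inputs. This is only a sketch: harrisGradientScale is a hypothetical name, and the is8U flag stands in for the depth == CV_8U test so that no OpenCV headers are needed:

    ```cpp
    // Mirrors the scale computation quoted in the question.
    // Hypothetical wrapper for experimentation, not an OpenCV function.
    // aperture_size < 0 represents the Scharr case (CV_SCHARR is -1).
    double harrisGradientScale(int aperture_size, int block_size, bool is8U) {
        double scale = (double)(1 << ((aperture_size > 0 ? aperture_size : 3) - 1)) * block_size;
        if (aperture_size < 0)
            scale *= 2.0;      // extra factor for Scharr: 4 * 2 = 8
        if (is8U)
            scale *= 255.0;    // bring 8-bit intensities into [0, 1]
        return 1.0 / scale;
    }
    ```

    With aperture_size = 3, block_size = 3 and an 8-bit image this gives 1/(255*12), the value from the question; with aperture_size = -1, block_size = 1 and a floating-point image it gives 1/8.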

    Finally block_size kicks in, and it is easy to understand: the squares and products of the gradients are later block-summed, that is, block_size*block_size elements are added together. To normalize the sum you should divide by block_size*block_size. Scaling each gradient by 1/block_size and then taking the product does the trick, because each product contains two scaled factors.
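
    The block_size argument can be verified with a few numbers. In the sketch below (illustrative helper names, plain C++), the first function scales every gradient by 1/block_size before squaring and summing, and the second computes the ordinary mean of the squared gradients; the two should agree when the vector holds block_size*block_size values:

    ```cpp
    #include <vector>

    // Sum of squares after scaling each gradient by 1/block_size.
    // `gradients` is assumed to hold the block_size * block_size values of one window.
    double blockSumOfScaledSquares(const std::vector<double>& gradients, int block_size) {
        double sum = 0.0;
        for (double g : gradients) {
            double scaled = g / block_size;
            sum += scaled * scaled;
        }
        return sum;
    }

    // Plain mean of the squared gradients over the same window.
    double meanOfSquares(const std::vector<double>& gradients) {
        double sum = 0.0;
        for (double g : gradients)
            sum += g * g;
        return sum / gradients.size();
    }
    ```

    For a 3x3 window filled with gradients 1 to 9, both functions return 285/9, so the unnormalized block sum of the pre-scaled gradients is already the block average.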