Search code examples
machine-learningcomputer-visionedge-detection

Is canny edge detection edge rotationlly invariant?


Suppose that the Canny edge detector successfully detects an edge in an image. The edge is then rotated by θ, where the relationship between a point on the original edge (x,y)(x,y) and a point on the rotated edge (x′,y′)(x′,y′) is defined as x′ = xcosθ; y′ = xsinθ;

Will the rotated edge be detected using the same Canny edge detector?

(I think we should find answer considering that the detection of an edge by the Canny edge detector depends only on the magnitude of its derivative.)


Solution

  • The answer is both yes and no, and which one you go for depends on how literally you take the question.

    First of all, we're dealing with a rectangular grid, so given an integer location (x,y), the corresponding point (x',y') in a rotated image is highly likely not an integer location. And considering that the output of Canny is a set of points, and not a smooth function that can be interpolated, it would be difficult to establish a correspondence between the set resulting from the rotated and the one resulting from the original image.

    Think for example about the number of pixels on a discrete line of a given length at 0 degrees and at 45 degrees. (Hint: the line at 45 degrees has sqrt(2) times fewer pixels.)

    But if you take the question more generally and interpret it as "will an edge that is detected in the original image also be detected after rotating the image by θ degrees?" then the answer is yes, in theory.

    Of course practice is always a bit different than theory. The details of the implementation matter here. And there is always numerical imprecision to contend with.

    Let's start by assuming the rotation is computed correctly, with a precise interpolation scheme (cubic, Lanczos) and not rounded after to uint8 or something (i.e. we're computing using floating-point values).

    If you read the original paper by Canny, you'll see he proposes using Gaussian derivatives as the best compromise between compact support and computational precision. I have seen few implementations that actually do. Typically I see a convolution with a Gaussian and then Sobel derivatives. Especially for smaller sigmas (less smoothing) the difference can be quite large. Gaussian derivatives are rotationally invariant, Sobel derivatives are not.

    The next step in the algorithm is non-maximum suppression. This is where the continuous gradient is converted to a set of points. For each pixel, it checks to see if it is a local maximum in the direction of the gradient. Because this is done per pixel, a different set of locations are tested in the rotated image compared to the original. Nonetheless, it should detect points along the same ridges in both cases.

    Next, a hysteresis threshold is applied. This is a two-threshold operation that keeps pixels above one threshold as long as at least one pixel above a second threshold is present in the same connected component. This is where the differences could occur between rotated and original image. Remember we're dealing with a set of pixels. We have samples the continuous gradient function at discrete points. There could be an edge that has one pixel above the second threshold in one version of the image, but not in the other. This would only occur for edges very close to the chosen threshold, of course.

    Next comes a thinning. Because the non-maximum suppression can yield points along a thicker line, a thinning operation is applied that removes pixels from the set that are not needed to maintain connectivity of the lines. Which pixels are selected here will also differ between rotated and original images, but this does not change the geometry of the solution, so we still have the same set of points.

    So, the answer is yes and no. :)

    Note that the same logic applies to translation.