Search code examples
opencvimage-processingfftmagnitude

Why using magnitude method to get processed image?


Hi guys I’ve thinking about this question:

I know that we use Fourier transform to get into frequency domain to process the image.

I read the text book, it said that when we are done with processing the image in the Fourier domain we have to invert it back to get processed image.

And the textbook taught to get the real part of the inverse.

However, when I go through the OpenCv tutorial, no matter if using OpenCV or NumPy version, eventually they use magnitude (for OpenCV) or np.abs (for NumPy).

For OpenCV, the inverse returns two channels which contain the real and imaginary components. When I took the real part of the inverse, I got a totally weird image.

May somebody who knows the meaning behind all of this:

  1. Why using magnitude or abs to get processed image?

  2. What’s wrong with textbook instruction (take the real part of inverse)?


Solution

  • The textbook is right, the tutorial is wrong.

    A real-valued image has a complex conjugate symmetry in the Fourier domain. This means that the FFT of the image will have a specific symmetry. Any processing that you do must preserve this symmetry if you want the inverse transform to remain real-valued. If you do this processing wrong, then the inverse transform will be complex-valued, and probably non-sensical.

    If you preserve the symmetry in the Fourier domain properly, then the imaginary component of the inverse transform will be nearly zero (likely different from zero because of numerical imprecision). Discarding this imaginary component is the correct thing to do. Computing the magnitude will yield the same result, except all negative values will become positive (note some filters are meant to produce negative values, such as derivative filters), and at an increased computational cost.

    For example, a convolution is a multiplication in the Fourier domain. The filter in the Fourier domain must be real-valued and symmetric around the origin. Often people will confuse where the origin is in the Fourier domain, and multiply by a filter that is seems symmetric, but actually is shifted with respect to the origin making it not symmetric. This shift introduces a phase change of the inverse transform (see the shift property of the Fourier transform). The magnitude of the inverse transform is not affected by the phase change, so taking the magnitude of this inverse transform yields an output that sort of looks OK, except if one expects to see negative values in the filter result. It would have been better to correctly understand the FFT algorithm, create a properly symmetric filter in the Fourier domain, and simply keep the real part of the inverse transform.

    Nonetheless, some filters are specifically designed to break the symmetry and yield a complex-valued filter output. For example the Gabor filter has an even (symmetric) component and an odd (anti-symmetric) component. The even component yields a real-valued output, the odd component yields an imaginary-valued output. In this case, it is the magnitude of the complex value that is of interest. Likewise, a quadrature filter is specifically meant to produce a complex-valued output. From this output, the analytic signal (or its multi-dimensional extension, the monogenic signal), both the magnitude and the phase are of interest, for example as used in the phase congruency method of edge detection.


    Looking at the linked tutorial, it is the line

    fshift[crow-30:crow+30, ccol-30:ccol+30] = 0
    

    which generates the Fourier-domain filter and applies it to the image (it is equivalent to multiplying by a filter with 1s and 0s). This tutorial correctly computes the origin of the Fourier domain (though for Python 3 you would use crow,ccol = rows//2 , cols//2 to get the integer division). But the filter above is not symmetric around that origin. In Python, crow-30:crow+30 indicates 30 pixels to the left of the origin, and only 29 pixels to the right (the right bound is not included!). The correct filter would be:

    fshift[crow-30:crow+30+1, ccol-30:ccol+30+1] = 0
    

    With this filter, the inverse transform is purely real (imaginary component has values in the order of 1e-13, which is numerical errors). Thus, it is now possible (and correct) to replace img_back = np.abs(img_back) with img_back = np.real(img_back).