python numpy scipy signal-processing fft

Filtering the two frequencies with highest amplitudes of a signal in the frequency domain

I have tried to filter the two frequencies which have the highest amplitudes. I am wondering if the result is correct, because the filtered signal seems less smooth than the original? Is it correct that the output of the FFT-function contains the fundamental frequency A0/ C0, and is it correct to include it in the search of the highest amplitude (it is indeed the highest!) ?

My code (based on my professors and collegues code, and I did not understand every detail so far):

# signal
data = np.loadtxt("profil.txt")
t = data[:,0]
x = data[:,1]
x = x-np.mean(x)           # Reduce signal to mean
n = len(t)
max_ind = int(n/2-1)       
dt = (t[n-1]-t[0])/(n-1)   
T =    n*dt                  
df = 1./T                  

# Fast-Fourier-Transformation
c     = 2.*np.absolute(fft(x))/n    #get the power sprectrum c from the array of complex numbers
c[0]  = c[0]/2.                     #correction for c0 (fundamental frequency)
f = np.fft.fftfreq(n, d=dt)       
a = fft(x).real
b = fft(x).imag
n_fft = len(a)

# filter 
p = np.ones(len(c))
p[c[0:int(len(c)/2)].argsort()[int(len(c)/2-1)]] = 0         #setting the positions of p to 0 with 
p[c[0:int(len(c)/2)].argsort()[int(len(c)/2-2)]] = 0         #the indices from the argsort function
print(c[0:int(len(c)/2-1)].argsort()[int(n_fft/2-2)])        #over the first half of the c array,  
ab_filter_2 = fft(x)                                        #because the second half contains the 
ab_filter_2.real = a*p                                      #negative frequencies.  
ab_filter_2.imag = b*p
x_filter2 = ifft(ab_filter_2)*2

I do not quite get the whole deal about FFT returning negative and positive frequencies. I know they are just mirrored, but then why can I not search over the whole array? And the iFFT function works with an array of just the positive frequencies?

The resulting plot: (blue original, red is filtered):

enter image description here

Solution

This part is very wasteful:

a = fft(x).real
b = fft(x).imag

You’re computing the FFT twice for no good reason. You compute it a 3rd time later, and you already computed it once before. You should compute it only once, not 4 times. The FFT is the most expensive part of your code.

Then:

ab_filter_2 = fft(x)
ab_filter_2.real = a*p
ab_filter_2.imag = b*p
x_filter2 = ifft(ab_filter_2)*2

Replace all of that with:

out = ifft(fft(x) * p)

Here you do the same thing twice:

p[c[0:int(len(c)/2)].argsort()[int(len(c)/2-1)]] = 0
p[c[0:int(len(c)/2)].argsort()[int(len(c)/2-2)]] = 0

But you set only the left half of the filter. It is important to make a symmetric filter. There are two locations where abs(f) has the same value (up to rounding errors!), there is a positive and a negative frequency that go together. Those two locations should have the same filter value (actually complex conjugate, but you have a real-valued filter so the difference doesn’t matter in this case).

I’m unsure what that indexing does anyway. I would split out the statement into shorter parts on separate lines for readability.

I would do it this way:

import numpy as np

x = ...
x -= np.mean(x)
fft_x = np.fft.fft(x)
c = np.abs(fft_x)  # no point in normalizing, doesn't change the order when sorting

f = c[0:len(c)//2].argsort()
f = f[-2:] # the last two elements are the indices to the largest two frequency components
p = np.zeros(len(c))
p[f] = 1   # preserve the largest two components
p[-f] = 1  # set the same components counting from the end

out = np.fft.ifft(fft_x * p).real
# note that np.fft.ifft(fft_x * p).imag is approximately zero if the filter is created correctly

Is it correct that the output of the FFT-function contains the fundamental frequency A0/ C0 […]?

In principle yes, but you subtracted the mean from the signal, effectively setting the fundamental frequency (DC component) to 0.