Search code examples
pythonnumpyscipyfftdft

Unexpected FFT Results with Python


I'm analyzing what is essentially a respiratory waveform, constructed in 3 different shapes (the data originates from MRI, so multiple echo times were used; see here if you'd like some quick background). Here are a couple of segments of two of the plotted waveforms for some context:

enter image description here

For each waveform, I'm trying to perform a DFT in order to discover the dominant frequency or frequencies of respiration.

My issue is that when I plot the DFTs that I perform, I get one of two things, dependent on the FFT library that I use. Furthermore, neither of them is representative of what I am expecting. I understand that data doesn't always look the way we want, but I clearly have waveforms present in my data, so I would expect a discrete Fourier transform to produce a frequency peak somewhere reasonable. For reference here, I would expect about 80 to 130 Hz.

My data is stored in a pandas data frame, with each echo time's data in a separate series. I'm applying the FFT function of choice (see the code below) and receiving different results for each of them.

SciPy (fftpack)

import pandas as pd
import scipy.fftpack

# temporary copy to maintain data structure
lead_pts_fft_df = lead_pts_df.copy()

# apply a discrete fast Fourier transform to each data series in the data frame
lead_pts_fft_df.magnitude = lead_pts_df.magnitude.apply(scipy.fftpack.fft)
lead_pts_fft_df

NumPy:

import pandas as pd
import numpy as np 

# temporary copy to maintain data structure
lead_pts_fft_df = lead_pts_df.copy()

# apply a discrete fast Fourier transform to each data series in the data frame
lead_pts_fft_df.magnitude = lead_pts_df.magnitude.apply(np.fft.fft)
lead_pts_fft_df

The rest of the relevant code:

ECHO_TIMES = [0.080, 0.200, 0.400] # milliseconds

f_s = 1 / (0.006) # 0.006 = time between samples
freq = np.linspace(0, 29556, 29556) * (f_s / 29556) # (29556 = length of data)

# generate subplots
fig, axes = plt.subplots(3, 1, figsize=(18, 16))

for idx in range(len(ECHO_TIMES)):
    axes[idx].plot(freq, lead_pts_fft_df.magnitude[idx].real, # real part
                   freq, lead_pts_fft_df.magnitude[idx].imag, # imaginary part

    axes[idx].legend() # apply legend to each set of axes

# show the plot
plt.show()

Post-DFT (SciPy fftpack):

enter image description here

Post-DFT (NumPy)

enter image description here

Here is a link to the dataset (on Dropbox) used to create these plots, and here is a link to the Jupyter Notebook.

EDIT:

I used the posted advice and took the magnitude (absolute value) of the data, and also plotted with a logarithmic y-axis. The new results are posted below. It appears that I have some wraparound in my signal. Am I not using the correct frequency scale? The updated code and plots are below.

# generate subplots
fig, axes = plt.subplots(3, 1, figsize=(18, 16))

for idx in range(len(ECHO_TIMES)):
    axes[idx].plot(freq[1::], np.log(np.abs(lead_pts_fft_df.magnitude[idx][1::])),
                   label=lead_pts_df.index[idx], # apply labels
                   color=plot_colors[idx]) # colors
    axes[idx].legend() # apply legend to each set of axes

# show the plot
plt.show()

enter image description here


Solution

  • I've figured out my issues.

    Cris Luengo was very helpful with this link, which helped me determine how to correctly scale my frequency bins and plot the DFT appropriately.

    ADDITIONALLY: I had forgotten to take into account the offset present in the signal. Not only does it cause issues with the huge peak at 0 Hz in the DFT, but it is also responsible for most of the noise in the transformed signal. I made use of scipy.signal.detrend() to eliminate this and got a very appropriate looking DFT.

    # import DFT and signal packages from SciPy
    import scipy.fftpack
    import scipy.signal
    
    # temporary copy to maintain data structure; original data frame is NOT changed due to ".copy()"
    lead_pts_fft_df = lead_pts_df.copy()
    
    # apply a discrete fast Fourier transform to each data series in the data frame AND detrend the signal
    lead_pts_fft_df.magnitude = lead_pts_fft_df.magnitude.apply(scipy.signal.detrend)
    lead_pts_fft_df.magnitude = lead_pts_fft_df.magnitude.apply(np.fft.fft)
    lead_pts_fft_df
    

    Arrange frequency bins accordingly:

    num_projections = 29556
    N = num_projections
    T = (6 * 3) / 1000 # 6*3 b/c of the nature of my signal: 1 pt for each waveform collected every third acquisition
    xf = np.linspace(0.0, 1.0 / (2.0 * T), num_projections / 2)
    

    Then plot:

    # generate subplots
    fig, axes = plt.subplots(3, 1, figsize=(18, 16))
    
    for idx in range(len(ECHO_TIMES)):
        axes[idx].plot(xf, 2.0/num_projections * np.abs(lead_pts_fft_df.magnitude[idx][:num_projections//2]),
                       label=lead_pts_df.index[idx], # apply labels
                       color=plot_colors[idx]) # colors
        axes[idx].legend() # apply legend to each set of axes
    
    # show the plot
    plt.show()
    

    enter image description here