python, pandas, numpy, fft

Fast Fourier Transform on an existing DataFrame is showing unexpected results


I have a .csv file with voltage data. When I plot the data against time, I can see that it is a sinusoidal wave with a frequency of 60 Hz.

Voltage data plot wrt time

Now, when I try to perform an FFT using the scipy/numpy fft modules, I get a spike near 0 Hz, while logically it should be at 60 Hz (shown below).

FFT output

When I try it with a sine wave created in Python I get proper results, but not with my actual data.

I'm sharing my code below, please let me know if I am doing something wrong. Thanks in advance.

import csv
from matplotlib import pyplot as plt
import pandas as pd
import numpy as np
from scipy.fftpack import fft
from scipy.fftpack import fftfreq

df = pd.read_csv('Va_data.csv')

print(df.head())

N = df.shape[0]
frequency = np.linspace(0.0,100, int(N/2))
freq_data = fft(df['Va'])
y = (2/N)*np.abs(freq_data[0:int(N/2)])

plt.plot(frequency, y)
plt.title('Frequency Domain Signal')
plt.xlabel('Frequency in Hz')
plt.ylabel('Amplitude')
plt.show()

Voltage data


Solution

  • The data is fine, and so is the FFT calculation (up to a constant scale factor). The problem is how the results are plotted: np.linspace(0.0, 100, int(N/2)) labels the half-spectrum as 0 to 100 Hz regardless of the actual sampling rate. To make the x-axis values represent frequency in hertz, you need

    frequency = np.arange(N) / N * sampling_rate
    

    where sampling_rate is the sampling rate of your data in Hz. You can then keep only the first half of it

    frequency = frequency[:N//2]
    

    and pass it to plt.plot(frequency, y). The expression for frequency above comes from the fact that each DFT coefficient X(k), for k = 0, ..., N-1, contains the factor exp(-j 2π k n / N), where k/N is the normalized frequency. Multiplying by the sampling rate recovers the corresponding frequency in the continuous-time domain.

    A sample:

    import numpy as np
    from matplotlib import pyplot as plt

    # sample x data: 1000 points over 4 seconds
    xs = np.linspace(0, 4, 1_000)

    # sampling rate in this case (about 249.75 Hz)
    fs = 1 / np.diff(xs)[0]

    # 60 Hz sine evaluated at the sample times
    ys = np.sin(2 * np.pi * 60 * xs)

    # taking FFT
    dft = np.fft.fft(ys)

    # getting x-axis values to represent freq in Hz
    N = len(xs)
    x_as_freq = np.arange(N) / N * fs

    # now plotting the full spectrum (the second half mirrors the first)
    plt.plot(x_as_freq, np.abs(dft))
    plt.xlabel("Frequency (Hz)")
    plt.ylabel("DFT magnitude")

    # to see that peak is indeed at 60Hz
    plt.xticks(np.arange(0, 250, 20))
    plt.show()
    

    which gives

    fft
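
As a rough sketch of the same fix applied to the pipeline in the question: the question does not state the real sampling rate, so the code below assumes 1 kHz and uses a synthetic 60 Hz sine as a stand-in for df['Va'] (both are assumptions for illustration only). np.fft.rfft and np.fft.rfftfreq return the one-sided spectrum and its matching frequency axis in Hz, so no manual cropping is needed.

```python
import numpy as np

# Assumed values for illustration; replace them with your real data.
fs = 1000.0   # assumed sampling rate in Hz (not given in the question)
N = 2000      # number of samples, i.e. 2 seconds of data at fs

# Synthetic stand-in for df['Va']: a 60 Hz sine with amplitude 10
t = np.arange(N) / fs
va = 10.0 * np.sin(2 * np.pi * 60 * t)

# One-sided FFT of the real-valued signal
spectrum = np.fft.rfft(va)

# Frequency axis in Hz for the one-sided spectrum; d is the sample spacing
frequency = np.fft.rfftfreq(N, d=1 / fs)

# Scale so that a pure sine of amplitude A gives a peak of height about A
amplitude = (2 / N) * np.abs(spectrum)

# Locate the dominant frequency
peak_freq = frequency[np.argmax(amplitude)]
print(peak_freq)
```

With real data, fs has to come from the measurement itself. If the CSV has a time column (hypothetically named 'time'), something like 1 / np.mean(np.diff(df['time'])) would estimate it, provided the samples are evenly spaced.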