I have a .csv file with voltage data. When I plot the data against time, I can see that it is a sinusoidal wave with a 60 Hz frequency.
Now, when I try to perform an FFT using the scipy/numpy fft modules, I get a spike near 0 frequency, while logically it should be at 60 Hz. (shown below)
When I tried it with a sine wave created in Python, I got proper results, but I'm not getting them with my actual data.
I'm sharing my code below, please let me know if I am doing something wrong. Thanks in advance.
import csv
from matplotlib import pyplot as plt
import pandas as pd
import numpy as np
from scipy.fftpack import fft
from scipy.fftpack import fftfreq
df = pd.read_csv('Va_data.csv')
print(df.head())
N = df.shape[0]
frequency = np.linspace(0.0,100, int(N/2))
freq_data = fft(df['Va'])
y = (2/N)*np.abs(freq_data[0:int(N/2)])
plt.plot(frequency, y)
plt.title('Frequency Domain Signal')
plt.xlabel('Frequency in Hz')
plt.ylabel('Amplitude')
plt.show()
The data should be fine, and the FFT calculation (up to a constant) is fine too. The issue is how the results are plotted. To make the x-axis values represent frequency in hertz, you need
frequency = np.arange(N) / N * sampling_rate
and then you can keep only the first half of it
frequency = frequency[:N//2]
and give it to plt.plot(frequency, y). The equation for frequency above comes from the fact that each DFT coefficient X(k) for k = 0, .., N-1 has an exp(-j 2pi kn/N) in it, where k/N gives you the normalized frequency. Multiplying by the sampling rate recovers the frequency corresponding to the continuous domain.
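As a quick sanity check of that bin-to-hertz mapping, the sketch below compares the formula against NumPy's np.fft.fftfreq helper. The values N = 8 and sampling_rate = 800.0 are made up purely for illustration and have nothing to do with the CSV in the question.

```python
import numpy as np

# Hypothetical values chosen only for illustration.
N = 8                  # number of samples
sampling_rate = 800.0  # samples per second

# Bin k sits at normalized frequency k/N; scaling by the sampling rate gives Hz.
frequency = np.arange(N) / N * sampling_rate

# NumPy's helper takes the sample spacing d = 1 / sampling_rate instead.
helper = np.fft.fftfreq(N, d=1 / sampling_rate)

# The first half (the non-negative frequencies) agrees in both versions.
print(frequency[:N // 2])
print(helper[:N // 2])
print(np.allclose(frequency[:N // 2], helper[:N // 2]))
```

The two arrays differ only in the upper half, where fftfreq reports the bins as negative frequencies. That half is dropped anyway when plotting a real-valued signal.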
A sample:
import numpy as np
from matplotlib import pyplot as plt
# sample x data
xs = np.linspace(0, 4, 1_000)
# sampling rate in this case
fs = 1 / np.diff(xs)[0]
# sine of it
ys = np.sin(2 * np.pi * 60 * xs)
# taking FFT
dft = np.fft.fft(ys)
# getting x-axis values to represent freq in Hz
N = len(xs)
x_as_freq = np.arange(N) / N * fs
# now plotting it
plt.plot(x_as_freq, np.abs(dft))
plt.xlabel("Frequency (Hz)")
plt.ylabel("DFT magnitude")
# to see that peak is indeed at 60Hz
plt.xticks(np.arange(0, 250, 20))
plt.show()
which gives a plot with a peak at 60 Hz.
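Applied to the setup in the question, the same idea might look like the sketch below. The sampling rate fs = 1000.0 and the synthetic sine standing in for df['Va'] are assumptions, since neither the CSV nor its sampling rate is shown. np.fft.rfft and np.fft.rfftfreq are used here as a shortcut that returns only the non-negative half of the spectrum, so no manual cropping is needed.

```python
import numpy as np

# Assumed sampling rate in Hz; replace it with the rate of the real recording.
fs = 1000.0
N = 1000

# Synthetic stand-in for df['Va']: one second of a 60 Hz sine, amplitude 10.
t = np.arange(N) / fs
va = 10.0 * np.sin(2 * np.pi * 60 * t)

# rfft keeps only the non-negative-frequency bins of a real-valued signal.
spectrum = np.fft.rfft(va)
# Matching x-axis in Hz, built from the sample spacing 1 / fs.
frequency = np.fft.rfftfreq(N, d=1 / fs)
# Same (2/N) scaling as in the question, so the peak height is the sine amplitude.
amplitude = (2 / N) * np.abs(spectrum)

# Frequency of the largest bin.
peak_hz = frequency[np.argmax(amplitude)]
print(peak_hz)
```

With the real data, va would be df['Va'].to_numpy() and fs would come from the time column, for example 1 / (t[1] - t[0]) if the samples are evenly spaced.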