Tags: python, numpy, matplotlib, signals, fft

How to plot frequency data from a .wav file in Python?


I'm trying to plot the frequencies that make up the first 1 second of a voice recording.

My approach was to:

  1. Read the .wav file as a numpy array containing time series data
  2. Slice the array to [0:sample_rate]. The sample rate has units of [samples/second], so sample_rate [samples/second] * 1 [second] = sample_rate [samples].
  3. Perform a fast Fourier transform (FFT) on the time series array to get the frequencies that make up that time-series sample.
  4. Plot the frequencies on the x-axis and the amplitude on the y-axis. The frequency domain would range from 0 to sample_rate/2, since the Nyquist sampling theorem tells us that the sample rate must be at least twice the highest frequency captured, i.e. sample_rate >= 2*max(frequency). I'll also slice the frequency output array in half, since the output frequency data is symmetrical.
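As a sanity check of steps 2 to 4, here is a minimal sketch that runs them on a synthetic tone instead of a recording. The 440 Hz tone and the 8 kHz sample rate are assumptions chosen for illustration, and it uses numpy.fft.rfft and numpy.fft.rfftfreq rather than the scipy.fftpack call from the question.

```python
import numpy as np

# Hypothetical stand-in for a recording: a 440 Hz tone sampled at 8 kHz for 2 s.
sample_rate = 8000
t = np.arange(2 * sample_rate) / sample_rate
audio = np.sin(2 * np.pi * 440.0 * t)

# Step 2: the first second is the first `sample_rate` samples.
first_second = audio[:sample_rate]

# Step 3: FFT of a real signal; rfft returns only the non-negative frequencies.
spectrum = np.fft.rfft(first_second)

# Step 4: frequency of each bin in Hz, running from 0 up to sample_rate / 2.
freqs = np.fft.rfftfreq(len(first_second), d=1 / sample_rate)

# The largest magnitude should sit at the tone's frequency.
peak_hz = freqs[np.argmax(np.abs(spectrum))]
print(peak_hz)
```

If the same steps give a sensible peak on a known tone but not on the recording, the difference is more likely in the data being passed in than in the FFT itself.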

Here is my implementation:

import matplotlib.pyplot as plt
import numpy as np
from scipy.fftpack import fft
from scipy.io import wavfile

sample_rate, audio_time_series = wavfile.read(audio_path)
single_sample_data = audio_time_series[:sample_rate]

def fft_plot(audio, sample_rate):
  N = len(audio)    # Number of samples
  T = 1/sample_rate # Period
  y_freq = fft(audio)
  domain = len(y_freq) // 2
  x_freq = np.linspace(0, sample_rate//2, N//2)
  plt.plot(x_freq, abs(y_freq[:domain]))
  plt.xlabel("Frequency [Hz]")
  plt.ylabel("Frequency Amplitude |X(t)|")
  return plt.show()

fft_plot(single_sample_data, sample_rate)
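A side note on the x-axis in the function above: np.linspace(0, sample_rate//2, N//2) includes the endpoint sample_rate/2, whereas the first N//2 FFT bins sit at k * sample_rate / N. The sketch below compares the two, assuming an 8 kHz sample rate for illustration. The offset is small and would not by itself explain the plot described in this question.

```python
import numpy as np

sample_rate = 8000   # assumed value, for illustration only
N = sample_rate      # one second of samples

# Axis as built in the question: N//2 points spread evenly over [0, sample_rate//2].
x_linspace = np.linspace(0, sample_rate // 2, N // 2)

# Bin centres of the first N//2 FFT outputs: k * sample_rate / N.
x_bins = np.fft.fftfreq(N, d=1 / sample_rate)[: N // 2]

# The two agree at 0 Hz and drift apart toward the top of the range.
max_offset_hz = np.max(np.abs(x_linspace - x_bins))
print(max_offset_hz)
```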

This is the plot that it generated:

Frequency Plot

However, this is incorrect: my spectrogram shows that I should have frequency peaks below 5 kHz:

Spectrogram Output

In fact, what this plot actually shows is the first second of my time series data:

Time Series Data

I confirmed this by removing the absolute value function from y_freq when plotting it and passing the entire audio signal into my fft_plot function:

...
sample_rate, audio_time_series = wavfile.read(audio_path)
single_sample_data = audio_time_series[:sample_rate]

def fft_plot(audio, sample_rate):
  N = len(audio)    # Number of samples
  y_freq = fft(audio)
  domain = len(y_freq) // 2
  x_freq = np.linspace(0, sample_rate//2, N//2)
  # Changed from abs(y_freq[:domain]) -> y_freq[:domain]
  plt.plot(x_freq, y_freq[:domain])
  plt.xlabel("Frequency [Hz]")
  plt.ylabel("Frequency Amplitude |X(t)|")
  return plt.show()

# Changed from single_sample_data -> audio_time_series
fft_plot(audio_time_series, sample_rate)

The code sample above produced this plot:

Debug Plot

Therefore, I think one of two things is going on:

  1. The fft() function is not actually performing an FFT on the time series data it is being given
  2. The .wav file does not contain time series data to begin with

What could be the issue? Has anyone else experienced this?
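One check that bears on both hypotheses is the shape of the array returned by wavfile.read. The question does not say whether the recording is mono or stereo, so the following is an assumption, not a diagnosis. Both scipy.fftpack.fft and numpy.fft.fft transform along the last axis by default. For a stereo array of shape (samples, 2), that is a 2-point transform across each left/right pair, not a transform over time. A minimal sketch with synthetic data (numpy.fft is used here in place of scipy.fftpack):

```python
import numpy as np

# Hypothetical stereo data shaped like wavfile.read output: (samples, channels).
rng = np.random.default_rng(0)
stereo = rng.standard_normal((8000, 2))
left, right = stereo[:, 0], stereo[:, 1]

# Default axis=-1: a 2-point FFT across the channel pair of every sample.
across_channels = np.fft.fft(stereo)

# The result is just left+right and left-right, still in the time domain.
sums_match = np.allclose(across_channels[:, 0], left + right)
diffs_match = np.allclose(across_channels[:, 1], left - right)

# axis=0 transforms along time and gives one spectrum per channel.
along_time = np.fft.fft(stereo, axis=0)
time_axis_match = np.allclose(along_time[:, 0], np.fft.fft(left))

print(sums_match, diffs_match, time_axis_match)
```

If print(audio_time_series.shape) shows two columns, a plot of the default-axis output would look like a waveform, not a spectrum.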


Solution

  • I have essentially replicated the code in the question, and I don't see the problem the OP described.

    In [172]: %reset -f
         ...: import matplotlib.pyplot as plt
         ...: import numpy as np
         ...: from scipy.fftpack import fft
         ...: from scipy.io import wavfile
         ...: 
         ...: sr, data = wavfile.read('sample.wav')
         ...: print(data.shape, sr)
         ...: signal = data[:sr,0]
         ...: Signal = fft(signal)
         ...: fig, (axt, axf) = plt.subplots(2, 1,
         ...:                                constrained_layout=1,
         ...:                                figsize=(11.8,3))
         ...: axt.plot(signal, lw=0.15) ; axt.grid(1)
         ...: axf.plot(np.abs(Signal[:sr//2]), lw=0.15) ; axf.grid(1)
         ...: plt.show()
    (268237, 2) 8000
    

    [Image: time-domain signal (top) and FFT magnitude (bottom)]

    Hence, I'm voting to close the question as "Not reproducible or was caused by a typo".
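For comparison with the question's code, note that the snippet in this answer selects a single channel with data[:sr,0] before calling fft (the printed shape is (268237, 2)). A minimal sketch of a helper that makes that step explicit follows. The function name magnitude_spectrum and its seconds and channel parameters are hypothetical, and the test signal is synthetic.

```python
import numpy as np

def magnitude_spectrum(data, sample_rate, seconds=1.0, channel=0):
    """Return (freqs_hz, magnitudes) for the first `seconds` of one channel."""
    data = np.asarray(data)
    if data.ndim == 2:
        # Stereo input is shaped (samples, channels); keep one channel.
        data = data[:, channel]
    n = int(round(seconds * sample_rate))
    segment = data[:n].astype(float)
    magnitudes = np.abs(np.fft.rfft(segment))
    freqs_hz = np.fft.rfftfreq(len(segment), d=1 / sample_rate)
    return freqs_hz, magnitudes

# Usage with a synthetic stereo signal: 300 Hz on the left, 1200 Hz on the right.
sr = 8000
t = np.arange(3 * sr) / sr
stereo = np.column_stack([np.sin(2 * np.pi * 300 * t),
                          np.sin(2 * np.pi * 1200 * t)])

freqs, mags = magnitude_spectrum(stereo, sr, channel=0)
peak_left = freqs[np.argmax(mags)]
freqs, mags = magnitude_spectrum(stereo, sr, channel=1)
peak_right = freqs[np.argmax(mags)]
print(peak_left, peak_right)
```

The same helper accepts a one-dimensional mono array unchanged, so it can be called without knowing the channel count in advance.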