Tags: matplotlib, signal-processing, spectrogram

How to change pyplot.specgram x and y axis scaling?


I have never worked with audio signals before and know little about signal processing. Nevertheless, I need to represent an audio signal using the pyplot.specgram function from the matplotlib library. Here is how I do it.

import matplotlib.pyplot as plt
import scipy.io.wavfile as wavfile

rate, frames = wavfile.read("song.wav")
plt.specgram(frames)

The result I am getting is this nice spectrogram: [image: resulting spectrogram]

When I look at the x-axis and y-axis, which I suppose are the time and frequency domains, I can't get my head around the fact that frequency is scaled from 0 to 1.0 and time from 0 to 80k. What is the intuition behind this and, more importantly, how do I represent it in a human-friendly format, such that frequency is 0 to 100k and time is in seconds?


Solution

    • Firstly, a spectrogram is a representation of the spectral content of a signal as a function of time - this is a frequency-domain representation of the time-domain waveform (e.g. a sine wave, your file "song.wav" or some other arbitrary wave - that is, amplitude as a function of time).

    • The frequency values (y-axis, Hertz) are wholly dependent on the sampling frequency of your waveform ("song.wav") and will range from "0" to "sampling frequency / 2", with the upper limit being the "Nyquist frequency" or "folding frequency" (https://en.wikipedia.org/wiki/Aliasing#Folding). The matplotlib specgram function cannot work out the sampling frequency from the array of samples; if you do not specify it, Fs defaults to 2, which is why your frequency axis runs from 0 to 1.0 and your time axis shows the sample index divided by 2 rather than seconds. The sampling frequency is defined as 1 / dt, with dt being the time interval between discrete samples of the waveform. wavfile.read already returns it as rate, so you can pass Fs=rate to the specgram function. It will be easier for you to get your head around what is going on if you work out and pass these variables to the specgram function yourself.

    • The time values (x-axis, seconds) depend on the duration of your "song.wav", i.e. the number of samples divided by the sampling frequency. You may notice some whitespace or padding if you use a large window length to calculate each spectrum slice (think of the individual spectra, which are arranged vertically and tiled horizontally to create the spectrogram image).

    • To make the axes more intuitive in the plot, label the x- and y-axes; you can also rescale the axis values (i.e. change the units).
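The axis ranges described in the points above can be checked numerically. The sketch below uses matplotlib.mlab.specgram, which returns the frequency and time arrays without drawing anything. The 8000 Hz rate and the 440 Hz test tone are assumptions chosen for illustration; they stand in for the rate and frames that wavfile.read would return for "song.wav".

```python
import numpy as np
from matplotlib import mlab

rate = 8000                             # assumed sampling frequency (Hz)
n_samples = 2 * rate                    # two seconds of audio
t = np.arange(n_samples) / rate         # sample times (s)
frames = np.sin(2 * np.pi * 440 * t)    # assumed 440 Hz test tone

# Fs not given: matplotlib falls back to its default, Fs=2
_, freqs_default, times_default = mlab.specgram(frames, NFFT=256, noverlap=128)

# Fs given: frequencies come out in Hz and times in seconds
_, freqs_hz, times_s = mlab.specgram(frames, NFFT=256, Fs=rate, noverlap=128)

print("default Fs: max frequency", freqs_default[-1], "last time", times_default[-1])
print("Fs=rate:    max frequency", freqs_hz[-1], "last time", times_s[-1])
```

With the default, the top of the frequency axis is Fs / 2 = 1.0 and the time axis counts half-samples; with Fs=rate the same data is reported up to rate / 2 Hz and in seconds.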

    Take home message - try to be a bit more verbose with your code: see below for my example.

        import matplotlib.pyplot as plt
        import numpy as np
    
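        # generate a 5 Hz sine wave, sampled at 50 Hz for 5 seconds
        fs = 50                      # sampling frequency (Hz)
        t = np.arange(0, 5, 1.0/fs)  # sample times (s)
        f0 = 5                       # signal frequency (Hz)
        phi = np.pi/2                # phase offset (rad)
        A = 1                        # amplitude
        x = A * np.sin(2 * np.pi * f0 * t + phi)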
    
        nfft = 25  # samples per FFT window (0.5 s at fs = 50)
    
        # plot x-t, time-domain, i.e. source waveform
        plt.subplot(211)
        plt.plot(t, x)
        plt.xlabel('time (s)')
        plt.ylabel('amplitude')
    
        # plot power(f)-t, frequency-domain, i.e. spectrogram
        plt.subplot(212)
        # call specgram function, setting Fs (sampling frequency) 
        # and nfft (number of waveform samples, defining a time window, 
        # for which to compute the spectra)
        plt.specgram(x, Fs=fs, NFFT=nfft, noverlap=5, detrend='mean', mode='psd')
        plt.xlabel('time (s)')
        plt.ylabel('frequency (Hz)')
        plt.show()
    

    [image: spectrogram of the 5 Hz sine wave]
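The same approach can be applied to the audio file from the question. What follows is a minimal sketch, not a definitive recipe: plot_spectrogram is a hypothetical helper name, the NFFT and noverlap values are arbitrary, and a synthetic 1 kHz tone at an assumed 44100 Hz rate stands in for "song.wav" so that the snippet runs without the file. It passes Fs=rate, averages stereo channels to mono (specgram expects a 1-D array), and uses a tick formatter to show the frequency axis in kHz.

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import FuncFormatter


def plot_spectrogram(frames, rate, nfft=1024, noverlap=512):
    """Hypothetical helper: spectrogram with time in s and frequency in kHz."""
    frames = np.asarray(frames, dtype=float)
    if frames.ndim == 2:
        # stereo data has shape (n_samples, n_channels); average to mono
        frames = frames.mean(axis=1)
    fig, ax = plt.subplots()
    spectrum, freqs, times, im = ax.specgram(
        frames, Fs=rate, NFFT=nfft, noverlap=noverlap)
    # freqs are in Hz because Fs=rate was passed; show tick labels in kHz
    ax.yaxis.set_major_formatter(
        FuncFormatter(lambda hz, pos: f"{hz / 1000:g}"))
    ax.set_xlabel("time (s)")
    ax.set_ylabel("frequency (kHz)")
    fig.colorbar(im, ax=ax, label="power spectral density (dB)")
    return fig, freqs, times


# With a real file you would instead use:
#     rate, frames = scipy.io.wavfile.read("song.wav")
rate = 44100                                   # assumed sampling rate (Hz)
t = np.arange(2 * rate) / rate                 # two seconds of sample times
rng = np.random.default_rng(0)
frames = np.sin(2 * np.pi * 1000 * t) + 0.01 * rng.standard_normal(t.size)

fig, freqs, times = plot_spectrogram(frames, rate)
# call plt.show() to display the figure in an interactive session
```

The helper returns the frequency and time arrays alongside the figure, so the axis ranges can be inspected directly: the frequencies stop at rate / 2 and the times stay within the duration of the signal.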