Tags: python, numpy, scipy, signal-processing, spectrogram

scipy.signal.spectrogram nfft parameter


What does the nfft parameter mean in this function? The documentation is here: https://docs.scipy.org/doc/scipy-0.19.0/reference/generated/scipy.signal.spectrogram.html


Solution

  • scipy.signal.spectrogram works by splitting the signal into (partially overlapping) time segments and then computing the power spectrum of each segment from its Fast Fourier Transform (FFT). The length of these segments is controlled by the nperseg argument, which lets you adjust the trade-off between resolution in the frequency and time domains that arises from the uncertainty principle. Making nperseg larger gives you more resolution in the frequency domain (with no zero-padding, the bins are spaced fs / nperseg apart) at the cost of less resolution in the time domain, because each segment covers a longer stretch of the signal.
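To make the trade-off concrete, here is a minimal sketch that runs the same signal through two segment lengths. The sampling rate, test tone and nperseg values are arbitrary choices for illustration, not anything from the question.

```python
import numpy as np
from scipy import signal

fs = 1000.0                        # sampling rate in Hz (assumed for illustration)
t = np.arange(2000) / fs           # 2 seconds of samples
x = np.sin(2 * np.pi * 100 * t)    # 100 Hz test tone

# Short segments: many time slices, coarse frequency bins.
f_short, t_short, S_short = signal.spectrogram(x, fs=fs, nperseg=64)

# Long segments: few time slices, fine frequency bins.
f_long, t_long, S_long = signal.spectrogram(x, fs=fs, nperseg=512)

# With the default nfft, the bin spacing is fs / nperseg.
df_short = f_short[1] - f_short[0]
df_long = f_long[1] - f_long[0]

print("nperseg=64 :", S_short.shape, "bin spacing", df_short, "Hz")
print("nperseg=512:", S_long.shape, "bin spacing", df_long, "Hz")
```

The returned array has shape (frequency bins, time segments), so the longer segment length shows up as more rows and fewer columns.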

    In addition to varying the number of samples that go into each segment, it's also sometimes desirable to apply zero-padding to each segment before taking its FFT. This is what the nfft argument is for:

    nfft : int, optional

    Length of the FFT used, if a zero padded FFT is desired. If None, the FFT length is nperseg. Defaults to None.

    By default, nfft == nperseg, meaning that no zero-padding will be used.
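As a quick check of that default, the sketch below compares a call that omits nfft, one that sets nfft equal to nperseg, and one that zero-pads each segment. The random input and the specific lengths are assumptions made for the example.

```python
import numpy as np
from scipy import signal

fs = 1000.0                                   # assumed sampling rate in Hz
x = np.random.default_rng(0).standard_normal(4000)

# nfft omitted: the FFT length falls back to nperseg.
f_default, t_default, S_default = signal.spectrogram(x, fs=fs, nperseg=128)

# nfft given explicitly as nperseg: should be the same computation.
f_same, t_same, S_same = signal.spectrogram(x, fs=fs, nperseg=128, nfft=128)

# nfft > nperseg: each 128-sample segment is zero-padded to 512 before the FFT.
f_padded, t_padded, S_padded = signal.spectrogram(x, fs=fs, nperseg=128, nfft=512)

print(np.allclose(S_default, S_same))    # default matches nfft=nperseg
print(len(f_default), len(f_padded))     # number of frequency bins
print(len(t_default), len(t_padded))     # number of time segments
```

Only the frequency axis grows with nfft. The number of time segments is set by nperseg and noverlap, so it stays the same.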

    Why would you want to apply zero-padding?

    • One reason is that this makes the FFT result longer, so you end up with more frequency bins (nfft // 2 + 1 for the one-sided spectrum of a real signal) and a spectrogram that looks "smoother" along the frequency dimension. Note, however, that this doesn't give you any more resolution in the frequency domain: the extra bins are interpolated from the same nperseg samples, so zero-padding is essentially an efficient way of doing sinc interpolation on the FFT result.
    • From a performance perspective it might make sense to pad the segments so that their length is a power of 2, since radix-2 FFTs can be significantly faster than more general methods.
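Both points can be checked with a short sketch. It assumes a random test signal, the default settings of scipy.signal.spectrogram, and a hypothetical helper next_pow2 that is not part of SciPy. The first part shows that the zero-padded result contains the unpadded bins unchanged, with interpolated values in between. The second part pads a segment length of 100 up to the next power of two.

```python
import numpy as np
from scipy import signal

fs = 1000.0                                   # assumed sampling rate in Hz
x = np.random.default_rng(0).standard_normal(8000)
nperseg = 100

# Unpadded versus 4x zero-padded spectrogram of the same segments.
f_raw, _, S_raw = signal.spectrogram(x, fs=fs, nperseg=nperseg)
f_pad, _, S_pad = signal.spectrogram(x, fs=fs, nperseg=nperseg, nfft=4 * nperseg)

# Every 4th padded bin sits at an unpadded frequency and carries the same
# value; the bins in between are interpolated, not newly resolved.
same_bins = np.allclose(S_pad[::4], S_raw)
print(len(f_raw), len(f_pad), same_bins)


def next_pow2(n):
    """Smallest power of two >= n (hypothetical helper, not a SciPy function)."""
    return 1 << (n - 1).bit_length()


# Pad each 100-sample segment up to 128 samples before the FFT.
nfft_fast = next_pow2(nperseg)
f_fast, _, S_fast = signal.spectrogram(x, fs=fs, nperseg=nperseg, nfft=nfft_fast)
print(nfft_fast, len(f_fast))
```

Whether the power-of-two padding is actually faster depends on the FFT backend and the segment length, so it is worth timing on your own data before relying on it.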