Search code examples
audiofft

Compute the cepstrum of an audio signal with FFT only


I'm trying to compute the Cepstrum of an audio signal of N=4096 samples with a microcontroller by using a FFT only (inverse FFT not available).

This is my approach so far:

  1. Put the audio samples into the real parts of the complex array which will be the input for the FFT
  2. Set the imaginary parts of the complex array to zero
  3. Compute the FFT
  4. Scale the values by dividing them by N for the 0th bin and N/2th bin. For every other bin divide the value by N/2.
  5. Compute the power spectrum | x[k] |² = X_re[k]² + X_im[k]² for every frequency bin

Now I'm stuck! My questions are:

  1. Do I only have to consider the power spectrum from 0 to N/2 for the cepstrum computation since the power spectrum is symmetric about N/2? That would imply that I only have to compute N/2 + 1 samples for the next FFT (resp. IFFT), right?
  2. Do I have to put the power spectrum values into the real part of the complex input array for the FFT (resp. IFFT) and set the imaginary parts to zero?
  3. If 2. is correct, can I just run a normal FFT to do an IFFT (Method#4 from https://www.dsprelated.com/showarticle/800.php)
  4. Do I have to scale the values again by dividing by N after that?

Thank you very much!


Solution

  • Answers to the Questions:

    1. Yes, use the bins from 0 to (N/2)-1. Do not use the bin N/2 because the FFT needs an even number of input data.
    2. Yes, put the log10(| x[k] |²) in the real part of the complex number and set the imaginary part to zero.
    3. Yes, just use the normal FFT to compute the IFFT with N/2 points (This is valid because the imaginary part is zero anyways, you can skip the steps to build the complex conjugate).
    4. Yes scale the values in the same way as described above. But since the amount of data points changed, scale by N/2 (0th and N/4th bin) and N/4 (every other bin).

    Now your results are from the 0th to the N/4th bin.

    Best regards Tom