
Using Apple's Accelerate framework, FFT, Hann windowing and Overlapping

I'm trying to set up an FFT for a project and don't yet have a clear picture of how the pieces fit together. Basically, I am using Audio Units to get the data from the device's microphone, and I then want to run an FFT on that data. This is what I understand so far: I need to set up a circular buffer for my data. On each filled buffer, I apply a Hann window and then do an FFT. However, I still need some help with overlapping. I understand I need it to get more precise results, especially since I am using windowing, but I can't find anything on this. Here's what I have so far (used for pitch detection):

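// Setup -------------
// A (input samples) and B (work buffer) are assumed to be float arrays of N samples allocated elsewhere
UInt32 log2N          = 10; // 1024 samples
UInt32 N              = (1 << log2N);
FFTSetup FFTSettings  = vDSP_create_fftsetup(log2N, kFFTRadix2);
DSPSplitComplex FFTData;    // split-complex buffer holding N/2 complex values
FFTData.realp         = (float *) malloc(sizeof(float) * N/2);
FFTData.imagp         = (float *) malloc(sizeof(float) * N/2);
float * hannWindow    = (float *) malloc(sizeof(float) * N);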

// create an array of floats to represent a hann window  
vDSP_hann_window(hannWindow, N, 0);

// FFT Time ----------  
// Moving data from A to B via hann window  
vDSP_vmul(A, 1, hannWindow, 1, B, 1, N);                                 

// Converting data in B into split complex form  
vDSP_ctoz((COMPLEX *) B, 2, &FFTData, 1, N/2);  

// Doing the FFT  
vDSP_fft_zrip(FFTSettings, &FFTData, 1, log2N, kFFTDirection_Forward);   

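// calculating square of magnitude for each value
// vDSP_fft_zrip packs the DC term into realp[0] and the Nyquist term into imagp[0],
// so those two bins are squared separately; the imaginary part is cleared so the
// inverse FFT sees a purely real power spectrum
float dc = FFTData.realp[0], nyquist = FFTData.imagp[0];
vDSP_zvmags(&FFTData, 1, FFTData.realp, 1, N/2);
vDSP_vclr(FFTData.imagp, 1, N/2);
FFTData.realp[0] = dc * dc;
FFTData.imagp[0] = nyquist * nyquist;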

// Inverse FFT  
vDSP_fft_zrip(FFTSettings, &FFTData, 1, log2N, kFFTDirection_Inverse);  

// Storing the autocorrelation results in B  
vDSP_ztoc(&FFTData, 1, (COMPLEX *)B, 2, N/2);  

vDSP_Length lastZeroCrossing;
vDSP_Length zeroCrossingCount;  
vDSP_nzcros(B, 1, N, &lastZeroCrossing, &zeroCrossingCount, N);  

// Cleanup -----------  

So where and how would I include overlapping? Also, any code snippets would be more than welcome. Thanks


The final goal for this project is to fingerprint the audio as close to real time as possible, so I need the results to be as accurate as possible, hence the overlapping. For this purpose I think I could actually drop everything from the inverse FFT to the cleanup.


  • You don't actually need to overlap - typically frames are overlapped to give higher resolution in the time axis, e.g. for plotting spectrograms or for estimating note onset times. You could just get your code working without overlapping for now, as it's less complicated, and then decide whether you need higher resolution on the time axis later.

    If you decide you do want to add overlapping then you will need to save a chunk of the previous buffer (e.g. 50%) and then for each new buffer you will process two complete buffers as follows:

    • process last 50% of old buffer + first 50% of new buffer
    • process 100% of new buffer
    • save last 50% of new buffer for next iteration

    For different overlap percentages a similar logic applies.

    Note that increasing overlap beyond a certain point can become counterproductive as the required processing bandwidth increases greatly with little gain in resolution.
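
The 50% scheme described above can be sketched in plain C roughly as follows. This is only an illustration under stated assumptions, not code from the question: `OverlapState`, `overlap_process`, `FrameHandler`, `FrameRecord` and `record_frame` are hypothetical names, `FRAME_SIZE` is kept tiny for readability (the question uses 1024), and the handler stands in for the existing window + FFT code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FRAME_SIZE 8                 /* N; 1024 in the question, small here for illustration */
#define HALF       (FRAME_SIZE / 2)  /* 50% overlap */

/* Hypothetical callback: this is where the Hann window and FFT would run. */
typedef void (*FrameHandler)(const float *frame, size_t n, void *ctx);

/* Hypothetical state carried between buffers. */
typedef struct {
    float tail[HALF];  /* last 50% of the previous buffer */
    int   hasTail;     /* 0 until the first buffer has been seen */
} OverlapState;

/* Feed one new buffer of FRAME_SIZE samples. */
static void overlap_process(OverlapState *st, const float *newBuf,
                            FrameHandler handler, void *ctx)
{
    assert(st != NULL && newBuf != NULL && handler != NULL);

    if (st->hasTail) {
        /* Frame 1: last 50% of old buffer + first 50% of new buffer */
        float work[FRAME_SIZE];
        memcpy(work, st->tail, HALF * sizeof(float));
        memcpy(work + HALF, newBuf, HALF * sizeof(float));
        handler(work, FRAME_SIZE, ctx);
    }

    /* Frame 2: 100% of new buffer */
    handler(newBuf, FRAME_SIZE, ctx);

    /* Save last 50% of new buffer for the next call */
    memcpy(st->tail, newBuf + HALF, HALF * sizeof(float));
    st->hasTail = 1;
}

/* Example handler (assumption): records the first sample of each frame. */
typedef struct {
    int   count;
    float firstSamples[8];
} FrameRecord;

static void record_frame(const float *frame, size_t n, void *ctx)
{
    FrameRecord *rec = ctx;
    (void)n;
    if (rec->count < 8)
        rec->firstSamples[rec->count] = frame[0];
    rec->count++;
}
```

Every call after the first hands two frames to the handler, so at 50% overlap the number of FFTs per second roughly doubles. At 75% overlap (a hop of N/4) it would be about four times as many, which is the processing cost the note above warns about.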