Search code examples
cmathsignal-processingfftphase

Extracting precise frequencies from FFT Bins using phase change between frames


I have been looking through this fantastic article: http://blogs.zynaptiq.com/bernsee/pitch-shifting-using-the-ft/

While being fantastic, it is extremely hard and heavy going. This material is really stretching me.

I have extracted the maths from Stefan's code module that calculates the exact frequency for a given bin. But I don't understand the last calculation. Can someone explain to me the mathematical construction at the end?

Before digging into the code, let me set the scene:

  • Let's say we set fftFrameSize = 1024, so we are dealing with 512+1 bins

  • As an example, Bin[1]'s ideal frequency fits a single wave in the frame. At a sample rate of 40KHz, tOneFrame = 1024/40K seconds = 1/40s, so Bin[1] would ideally be collecting a 40Hz signal.

  • Setting osamp (overSample) = 4, we progress along our input signal in steps of 256. So the first analysis examines bytes zero through 1023, then 256 through 1279, etc. Note each float gets processed 4 times.

...

void calcBins( 
              long fftFrameSize, 
              long osamp, 
              float sampleRate, 
              float * floats, 
              BIN * bins
              )
{
    /* initialize our static arrays */
    static float gFFTworksp[2*MAX_FRAME_LENGTH];
    static float gLastPhase[MAX_FRAME_LENGTH/2+1];

    static long gInit = 0;
    if (! gInit) 
    {
        memset(gFFTworksp, 0, 2*MAX_FRAME_LENGTH*sizeof(float));
        memset(gLastPhase, 0, (MAX_FRAME_LENGTH/2+1)*sizeof(float));
        gInit = 1;
    }

    /* do windowing and re,im interleave */
    for (long k = 0; k < fftFrameSize; k++) 
    {
        double window = -.5*cos(2.*M_PI*(double)k/(double)fftFrameSize)+.5;
        gFFTworksp[2*k] = floats[k] * window;
        printf("sinValue: %f", gFFTworksp[2*k]);
        gFFTworksp[2*k+1] = 0.;
    }

    /* do transform */
    smbFft(gFFTworksp, fftFrameSize, -1);

    printf("\n");

    /* this is the analysis step */
    for (long k = 0; k <= fftFrameSize/2; k++) 
    {
        /* de-interlace FFT buffer */
        double real = gFFTworksp[2*k];
        double imag = gFFTworksp[2*k+1];

        /* compute magnitude and phase */
        double magn = 2.*sqrt(real*real + imag*imag);
        double phase = atan2(imag,real);

        /* compute phase difference */
        double phaseDiff = phase - gLastPhase[k];
        gLastPhase[k] = phase;

        /* subtract expected phase difference */
        double binPhaseOffset = M_TWOPI * (double)k / (double)osamp;
        double deltaPhase = phaseDiff - binPhaseOffset;

        /* map delta phase into [-Pi, Pi) interval */
        // better, but obfuscatory...
        //    deltaPhase -= M_TWOPI * floor(deltaPhase / M_TWOPI + .5);

        while (deltaPhase >= M_PI)
            deltaPhase -= M_TWOPI;
        while (deltaPhase < -M_PI)
            deltaPhase += M_TWOPI;

(EDIT:) Now the bit I don't get:

        // Get deviation from bin frequency from the +/- Pi interval 
        // Compute the k-th partials' true frequency    

        // Start with bin's ideal frequency
        double bin0Freq = (double)sampleRate / (double)fftFrameSize;
        bins[k].idealFreq = (double)k * bin0Freq;

        // Add deltaFreq
        double sampleTime = 1. / (double)sampleRate;
        double samplesInStep = (double)fftFrameSize / (double)osamp;
        double stepTime = sampleTime * samplesInStep;
        double deltaTime = stepTime;        

        // Definition of frequency is rate of change of phase, i.e. f = dϕ/dt
        // double deltaPhaseUnit = deltaPhase / M_TWOPI; // range [-.5, .5)
        double freqAdjust = (1. / M_TWOPI) * deltaPhase / deltaTime; 

        // Actual freq <-- WHY ???
        bins[k].freq = bins[k].idealFreq + freqAdjust;
    }
}

I just can't see it clearly, even though it seems to be staring in the face. Could someone please explain this process from scratch, step by step?


Solution

  • Finally I have figured this out; really I had to derive it from scratch. I knew there would be some simple way to derive it, my (usual) mistake was to attempt to follow other people's logic rather than use my own common sense.

    This puzzle takes two keys to unlock it.

    ...

    for (int k = 0; k <= fftFrameSize/2; k++) 
    {
        // compute magnitude and phase 
        bins[k].mag = 2.*sqrt(fftBins[k].real*fftBins[k].real + fftBins[k].imag*fftBins[k].imag);
        bins[k].phase = atan2(fftBins[k].imag, fftBins[k].real);
    
        // Compute phase difference Δϕ fo bin[k]
        double deltaPhase;
        {
            double measuredPhaseDiff = bins[k].phase - gLastPhase[k];
            gLastPhase[k] = bins[k].phase;
    
            // Subtract expected phase difference <-- FIRST KEY
            // Think of a single wave in a 1024 float frame, with osamp = 4
            //   if the first sample catches it at phase = 0, the next will 
            //   catch it at pi/2 ie 1/4 * 2pi
            double binPhaseExpectedDiscrepancy = M_TWOPI * (double)k / (double)osamp;
            deltaPhase = measuredPhaseDiff - binPhaseExpectedDiscrepancy;
    
            // Wrap delta phase into [-Pi, Pi) interval 
            deltaPhase -= M_TWOPI * floor(deltaPhase / M_TWOPI + .5);
        }
    
        // say sampleRate = 40K samps/sec, fftFrameSize = 1024 samps in FFT giving bin[0] thru bin[512]
        // then bin[1] holds one whole wave in the frame, ie 44 waves in 1s ie 44Hz ie sampleRate / fftFrameSize
        double bin0Freq = (double)sampleRate / (double)fftFrameSize;
        bins[k].idealFreq = (double)k * bin0Freq;
    
        // Consider Δϕ for bin[k] between hops.
        // write as 2π / m.
        // so after m hops, Δϕ = 2π, ie 1 extra cycle has occurred   <-- SECOND KEY
        double m = M_TWOPI / deltaPhase;
    
        // so, m hops should have bin[k].idealFreq * t_mHops cycles.  plus this extra 1.
        // 
        // bin[k].idealFreq * t_mHops + 1 cycles in t_mHops seconds 
        //   => bins[k].actualFreq = bin[k].idealFreq + 1 / t_mHops
        double tFrame = fftFrameSize / sampleRate;
        double tHop = tFrame / osamp;
        double t_mHops = m * tHop;
    
        bins[k].freq = bins[k].idealFreq + 1. / t_mHops;
    }