Search code examples
c++fftfftwspectrogram

Vectors and matrices in C++ for generating a spectrogram


This is my first attempt to generate a spectrogram of a sinusoidal signal with C++. To generate the spectrogram:

  1. I divided the real sinusoidal signal into B blocks

  2. Applied Hanning window on each block (I assumed there is no overlap). This should give me the inputs for the fft, in[j][k] where k is the block number

  3. Apply fft on in[j][k] for each block and store it.

Here is the script:

#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#include <fftw3.h>
#include <iostream>
#include <cmath>
#include <fstream>
using namespace std;

int main(){
    int i;
    int N = 500; // sampled
    int Windowsize = 100;
    double Fs = 200; // sampling frequency
    double T = 1 / Fs; // sample time 
    double f = 50; // frequency
    double *in;
    fftw_complex *out;
    double t[N]; // time vector 
    fftw_plan plan_forward;
    std::vector<double> signal(N);
    int B = N / Windowsize; //number of blocks
    in = (double*)fftw_malloc(sizeof(double) * N);
    out = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * N);

    //Generating the signal
    for(int i = 0; i < = N; i++){
        t[i] = i * T;
        signal[i] = 0.7 * sin(2 * M_PI * f * t[i]);// generate sine waveform
    }

    //Applying the Hanning window function on each block B
    for(int k = 0; i <= B; k++){ 
        for(int j = 0; j <= Windowsize; j++){
            double multiplier = 0.5 * (1 - cos(2 * M_PI * j / (N-1))); // Hanning Window
            in[j][k] = multiplier * signal[j];
        }
        plan_forward = fftw_plan_dft_r2c_1d (Windowsize, in, out, FFTW_ESTIMATE );
        fftw_execute(plan_forward);
        v[j][k]=(20 * log(sqrt(out[i][0] * out[i][0] + out[i][1] * out[i][1]))) / N;
    }

    fftw_destroy_plan(plan_forward);
    fftw_free(in);
    fftw_free(out);
    return 0;
    }

So, the question is: What is the correct way to declare in[j][k] and v[j][k] variables.

Update:I have declared my v [j] [k] as a matrix : double v [5][249]; according to this site :http://www.cplusplus.com/doc/tutorial/arrays/ so now my script looks like:

#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#include <fftw3.h>
#include <iostream>
#include <cmath>
#include <fstream>
using namespace std;
int main()
{
int i;
double y;
int N=500;//Number of pints acquired inside the window
double Fs=200;//sampling frequency
int windowsize=100;
double dF=Fs/N;
double  T=1/Fs;//sample time 
double f=50;//frequency
double *in;
fftw_complex *out;
double t[N];//time vector 
double tt[5];
double ff[N];
fftw_plan plan_forward;
double  v [5][249];
in = (double*) fftw_malloc(sizeof(double) * N);
out = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * N);
plan_forward = fftw_plan_dft_r2c_1d ( N, in, out, FFTW_ESTIMATE );

for (int i=0; i<= N;i++)
  {
   t[i]=i*T;
   in[i] =0.7 *sin(2*M_PI*f*t[i]);// generate sine waveform
   }

 for (int k=0; k< 5;k++){ 
   for (int i = 0; i<windowsize; i++){
   double multiplier = 0.5 * (1 - cos(2*M_PI*i/(windowsize-1)));//Hanning Window
   in[i] = multiplier * in[i+k*windowsize];
   fftw_execute ( plan_forward );
         for (int i = 0; i<= (N/2); i++)
         {
         v[k][i]=(20*log10(sqrt(out[i][0]*out[i][0]+ out[i][1]*out[i]   [1])));//Here   I  have calculated the y axis of the spectrum in dB
           }
                                    }
                      }

for (int k=0; k< 5;k++)//Center time for each block
      {
       tt[k]=(2*k+1)*T*(windowsize/2);
       }

fstream myfile;
myfile.open("example2.txt",fstream::out);
myfile << "plot '-' using 1:2" << std::endl;
for (int k=0; k< 5;k++){ 
   for (int i = 0; i<= ((N/2)-1); i++)
         { 
        myfile << v[k][i]<< " " << tt[k]<< std::endl;

          }
                         }
myfile.close();
fftw_destroy_plan ( plan_forward );
fftw_free ( in );
fftw_free ( out );
return 0;
  }

I do not get errors anymore but the spectrogram plot is not right.


Solution

  • As indicated in FFTW's documentation, the size of the output (out in your case) when using fftw_plan_dft_r2c_1d is not the same as the size of the input. More specifically for an input of N real samples, the output consists of N/2+1 complex values. You may then allocate out with:

    out = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * (N/2 + 1));
    

    For the spectrogram output you will then similarly have (N/2+1) magnitudes for each of the B blocks, resulting in the 2D array:

    double** v = new double*[B];
    for (int i = 0; i < B; i++){
      v[i] = new double[(N/2+1)];
    }
    

    Also, note that you may reuse the input buffer in for each iteration (filling it with data for a new block). However since you have chosen to compute an N-point FFT and will be storing smaller blocks of Windowsize samples (in this case N=500 and Windowsize=100), make sure to initialize the remaining samples with zeros:

    in = (double*)fftw_malloc(sizeof(double) * N);
    for (int i = 0; i < N; i++){
      in[i] = 0;
    }
    

    Note that in addition to the declaration and allocation of the in and v variables, the code you posted suffers from a few additional issues:

    • When computing the Hanning window, you should divide by the Windowsize-1 not N-1 (since in your case N correspond to the FFT size).
    • You are taking the FFT of the same block of signal over and over again since you are always indexing with j in the [0,Windowsize] range. You would most likely want to add an offset each time you process a different block.
    • Since the FFT size does not change, you only need to create the plan once. At the very least if you are going to create your plan at every iteration, you should similarly destroy it (with fftw_destroy_plan) at every iteration.

    And a few additional points which may require some thoughts:

    • Scaling the log-scaled magnitudes by dividing by N might not do what you think. You are much more likely to want to scale the linear-scale magnitudes (ie. divide the magnitude before taking the logarithm). Note that this will result in a constant offset of the spectrum curve, which for many application is not that significant. If the scaling is important for your application, you may have a look at another answer of mine for more details.
    • The common formula 20*log10(x) typically used to convert linear scale to decibels uses a base-10 logarithm instead of the natural log (base e~2.7182) function which you've used. This would result in a multiplicative scaling (stretching), which may or may not be significant depending on your application.

    To summarize, the following code might be more in line with what you are trying to do:

    // Allocate & initialize buffers
    in = (double*)fftw_malloc(sizeof(double) * N);
    for (int i = 0; i < N; i++){
      in[i] = 0;
    }
    out = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * (N/2 + 1));
    v = new (double*)[B];
    for (int i = 0; i < B; i++){
      v[i] = new double[(N/2+1)];
    }
    
    // Generate the signal
    ...
    
    // Create the plan once
    plan_forward = fftw_plan_dft_r2c_1d (Windowsize, in, out, FFTW_ESTIMATE);
    
    // Applying the Hanning window function on each block B
    for(int k = 0; k < B; k++){ 
        for(int j = 0; j < Windowsize; j++){
            // Hanning Window
            double multiplier = 0.5 * (1 - cos(2 * M_PI * j / (Windowsize-1)));
            in[j] = multiplier * signal[j+k*Windowsize];
        }
        fftw_execute(plan_forward);
        for (int j = 0; j <= N/2; j++){
          // Factor of 2 is to account for the fact that we are only getting half
          // the spectrum (the other half is not return by a R2C plan due to symmetry)
          v[k][j] = 2*(out[j][0] * out[j][0] + out[j][1] * out[j][1])/(N*N);
        }
        // DC component and at Nyquist frequency do not have a corresponding symmetric 
        // value, so should not have been doubled up above. Correct those special cases.
        v[k][0]   *= 0.5;
        v[k][N/2] *= 0.5;
        // Convert to decibels
        for (int j = 0; j <= N/2; j++){
          // 20*log10(sqrt(x)) is equivalent to 10*log10(x)
          // also use some small epsilon (e.g. 1e-5) to avoid taking the log of 0
          v[k][j] = 10 * log10(v[k][j] + epsilon);
        }
    }
    
    // Clean up
    fftw_destroy_plan(plan_forward);
    fftw_free(in);
    fftw_free(out);
    // Delete this last one after you've done something useful with the spectrogram
    for (int i = 0; i < B; i++){
      delete[] v[i];
    }
    delete[] v;