Does FFTW output depend on the size of the input?

Over the last week I have been programming some 2-dimensional convolutions with FFTW, by transforming both signals to the frequency domain, multiplying them, and then transforming back.

Surprisingly, I get the correct result only when the input size is below a certain threshold!

I am posting some working code, in which I take simple constant matrices in the spatial domain: value 5 for the input and 1 for the filter. This way, the result of convolving them should be a matrix holding the average of the input values, i.e., 5, since the input is constant. This is indeed the output when width and height range up to h=215, w=215. If I set h=216, w=216, or greater, the output gets corrupted! I would really appreciate some clues about where I could be making a mistake. Thank you very much!

#include <iostream>
#include <fftw3.h>

using std::endl;

int main(int argc, char* argv[]) {

int h=215, w=215;

//Input and 1 filter are declared and initialized here
float *in = (float*) fftwf_malloc(sizeof(float)*w*h);
float *identity = (float*) fftwf_malloc(sizeof(float)*w*h);
for(int i=0;i<w*h;i++){
        in[i]=5;
        identity[i]=1;
    }

//Declare two forward plans and one backward    
fftwf_plan plan1, plan2, plan3;

//Allocate for complex output of both transforms
fftwf_complex *inTrans = (fftwf_complex*) fftwf_malloc(sizeof(fftwf_complex)*h*(w/2+1));
fftwf_complex *identityTrans = (fftwf_complex*) fftwf_malloc(sizeof(fftwf_complex)*h*(w/2+1));

//Initialize forward plans
plan1 = fftwf_plan_dft_r2c_2d(h, w, in, inTrans, FFTW_ESTIMATE);
plan2 = fftwf_plan_dft_r2c_2d(h, w, identity, identityTrans, FFTW_ESTIMATE);

//Execute them
fftwf_execute(plan1);
fftwf_execute(plan2);

//Multiply in frequency domain. Theoretically, no need to multiply imaginary parts; since signals are real and symmetric
//their transforms are also real, identityTrans[i][1] = 0, but I leave this here for a more generic implementation.

for(int i=0; i<(w/2+1)*h; i++){
    //Compute the real part into a temporary: the imaginary part still needs the original inTrans[i][0]
    float re = inTrans[i][0]*identityTrans[i][0] - inTrans[i][1]*identityTrans[i][1];
    inTrans[i][1] = inTrans[i][0]*identityTrans[i][1] + inTrans[i][1]*identityTrans[i][0];
    inTrans[i][0] = re;
}
//Execute inverse transform, store result in identity, where the identity filter was stored.
plan3 = fftwf_plan_dft_c2r_2d(h, w, inTrans, identity, FFTW_ESTIMATE);
fftwf_execute(plan3);

//Output first results of convolution(in, identity) to see if they are the average of in.
for(int i=0;i<h/h+4;i++){
    for(int j=0;j<w/w+4;j++){
        std::cout<<"After convolution, component (" <<  i  <<","<< j << ") is " << identity[j+i*w]/(w*h*w*h) << endl;
    }
}std::cout<<endl;

//Compute average of data
float sum=0.0;
for(int i=0; i<w*h;i++)
    sum+=in[i];

std::cout<<"Mean of input was " <<  (float)sum/(w*h)  << endl;
std::cout<< endl;

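fftwf_destroy_plan(plan1);
fftwf_destroy_plan(plan2);
fftwf_destroy_plan(plan3);

//Release the buffers allocated with fftwf_malloc
fftwf_free(in);
fftwf_free(identity);
fftwf_free(inTrans);
fftwf_free(identityTrans);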


return 0;
}

Solution

Your problem has nothing to do with FFTW! It comes from this line:

std::cout<<"After convolution, component (" <<  i  <<","<< j << ") is " << identity[j+i*w]/(w*h*w*h) << endl;

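If w=216 and h=216, then w*h*w*h = 2 176 782 336. The upper limit of a signed 32-bit integer is 2 147 483 647, so the integer product overflows. With w=h=215 the product is 2 136 750 625, which still fits; that is why the output only breaks from 216 onwards.

The solution is to cast the first factor of the denominator to float, so that the whole product is evaluated in floating point.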

std::cout<<"After convolution, component (" <<  i  <<","<< j << ") is " << identity[j+i*w]/(((float)w)*h*w*h) << endl;
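As a rough illustration of where the threshold comes from, here is a minimal sketch that evaluates the normalization factor in 64-bit arithmetic and compares it against the 32-bit limit. The helper names `normalizationFitsInInt32` and `normalizationFactor` are hypothetical, introduced only for this example, and the sketch assumes `int` is 32 bits wide, as it is on most current platforms.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: returns true if (w*h)*(w*h) fits in a signed 32-bit
// integer. The product is formed in 64-bit arithmetic, so the check itself
// never overflows.
bool normalizationFitsInInt32(int w, int h) {
    assert(w > 0 && h > 0);
    const std::int64_t limit = std::numeric_limits<std::int32_t>::max();
    const std::int64_t n = static_cast<std::int64_t>(w) * h;
    if (n > limit) {
        return false;  // w*h alone is already too large
    }
    return n * n <= limit;
}

// Hypothetical helper: the same factor computed in floating point, which is
// what the cast in the corrected output line achieves.
double normalizationFactor(int w, int h) {
    const double n = static_cast<double>(w) * h;
    return n * n;
}
```

Calling `normalizationFitsInInt32(215, 215)` reports that the product fits, while `normalizationFitsInInt32(216, 216)` reports that it does not, which matches the sizes at which the output starts to look corrupted.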

The next problem you are going to run into is this one:

 float sum=0.0;
 for(int i=0; i<w*h;i++)
      sum+=in[i];

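Remember that a float has about 7 significant decimal digits. Once the running sum becomes much larger than the individual terms, each addition is rounded, so if w=h=4000 the computed average can come out lower than the real one. Use a double for the accumulator, or write two loops and sum in the inner loop (localsum) before adding to the outer sum (sum+=localsum)!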
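The three ways of summing mentioned above can be sketched as follows. This is only an illustration, not the original program: the function names are hypothetical, and the data is held in a `std::vector<float>` in row-major order rather than in a buffer from `fftwf_malloc`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Naive accumulation in a single float, as in the question.
float meanNaiveFloat(const std::vector<float>& data) {
    assert(!data.empty());
    float sum = 0.0f;
    for (float v : data) {
        sum += v;
    }
    return sum / static_cast<float>(data.size());
}

// Same loop, but with a double accumulator.
double meanDouble(const std::vector<float>& data) {
    assert(!data.empty());
    double sum = 0.0;
    for (float v : data) {
        sum += v;
    }
    return sum / static_cast<double>(data.size());
}

// Two loops: sum each row into localsum first, then add the row sums.
// The partial sums stay closer in magnitude to the values being added.
float meanBlockedFloat(const std::vector<float>& data, std::size_t w, std::size_t h) {
    assert(w > 0 && h > 0 && data.size() == w * h);
    float sum = 0.0f;
    for (std::size_t i = 0; i < h; ++i) {
        float localsum = 0.0f;
        for (std::size_t j = 0; j < w; ++j) {
            localsum += data[i * w + j];
        }
        sum += localsum;
    }
    return sum / static_cast<float>(w * h);
}
```

For a 4000 x 4000 input filled with 5, the double and two-loop versions recover the mean, whereas the naive float loop typically does not on platforms that evaluate float arithmetic in single precision. The two-loop version only reduces the error; for arbitrary data a double accumulator is the simpler and safer choice.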

Bye,

Francis