Problem with big matrices using fftw3 in Fortran (example code)

this question follows from my last question, but now with the code.

I have problems with the "Fastest Fourier Transform in the West" (link) implemented in Fortran, in particular calculating the inverse of the fft. When I test with small matrices the result is perfect, but from 8x8 on the result is wrong.

Here is my code here. I written it with comments inside. The example matrices are in the files ex1.dat,... ex5.dat, so it is easy to test (I use the intel compiler, I'm not sure that runs with gfortran). Examples ex2 and ex3 works perfect (5x5 and 7x7), but the other examples give wrong results, so I can't understand the error or where looking for.

Inside the code: to verify that all is right I calculate

PRINT*, MINVAL( AA - AA_new ), MAXVAL( AA-AA_new )

where AA is the original matrix, and AA_new is the matrix AA after calculate the fft and then the inverse of the fft. We expect that AA==AA_new, so we expect that MINVAL( AA - AA_new ), MAXVAL( AA-AA_new ) be zero. But for bigger matrices these numbers are big, so AA and AA_new are very different.

Also I'm comparing the result with the command fft2 of matlab, because uses the same library according its documentation.

About the code:

the main file is fft_test.f90,
and all corresponding to the calculus of the fft using fftw3 is in fft_modules.f90. -The definition of _dp (the precision) is in decimal.f90.

You can download the code from the last link, but also I write below:

PROGRAM fftw_test
  !
  USE decimal
  USE fftw_module
  !
  IMPLICIT NONE
  !
  REAL(KIND=dp),    ALLOCATABLE :: AA(:,:), AA_new(:,:)
  COMPLEX(KIND=dp), ALLOCATABLE :: ATF(:,:)
  INTEGER                       :: NN, MM, ii, jj
  !
  OPEN(unit=10,file='ex2.dat',status='old',action='read')
  READ(10,*) NN,MM ! <- NNxMM is the size of the matrix inside the file vel1.dat
  !
  ALLOCATE( AA(NN,MM), AA_new(NN,MM), ATF(NN,MM) )
  AA = 0.0_dp;  AA_new = 0.0_dp;  ATF = 0.0_dp
  !
  DO ii=1,NN
     READ(10,*) ( AA(ii,jj), jj=1,NN ) ! AA is the original matrix (real)
  END DO
  CLOSE(10)
  !
  PRINT*,'MIN and MAX value of AA' !<- just to verify that is not null
  PRINT*, MINVAL( AA ), MAXVAL( AA )
  !
  CALL fft(NN,MM,AA,ATF) ! ATF (complex) is the fft of AA (real)
  !
  CALL ifft(NN,MM,ATF,AA_new) ! AA_new is the inverse fft AA, so must be == AA
  !
  PRINT*,'MIN and MAX value of AA_new' !<- just to verify that is not null
  PRINT*, MINVAL( AA_new ), MAXVAL( AA_new )
  !
  PRINT*,'Max and min val of (AA-AA_new)'!<- just to compare some way the result
  PRINT*, MINVAL( AA - AA_new ), MAXVAL( AA-AA_new )
  PRINT*,' ------------------------------------------------------------------- '
  !
  DEALLOCATE( AA, ATF )
  !
END PROGRAM fftw_test

MODULE fftw_module
  !
  ! Contains the forward and inverse discrete fourier transform of a matrix
  ! using the library fftw3
  !
  USE decimal
  !
CONTAINS
  !
  SUBROUTINE fft(NX,NY,AA,ATF)
    !
    ! Calculate the forward Discrete Fourier transform ATF of the matriz AA
    ! Both matrices have the same size, but AA (the input) is real
    ! and ATF (the output) is complex
    !
    USE, INTRINSIC :: iso_c_binding
    IMPLICIT none
    INCLUDE 'fftw3.f03'
    !   include 'aslfftw3.f03'
    !
    INTEGER(C_INT),            INTENT(in)  :: Nx, Ny
    COMPLEX(C_DOUBLE_COMPLEX), ALLOCATABLE :: zin(:), zout(:)
    TYPE(C_PTR)                            :: planf
    !
    INTEGER                                :: ii, yy, ix, iy
    REAL(KIND=dp)                          :: AA(:,:)
    COMPLEX(KIND=dp)                       :: ATF(:,:)
    !
    ALLOCATE( zin(NX * NY), zout(NX * NY) )
    !
    ! Plan Creation (out-of-place forward and backward FFT)
    planf = fftw_plan_dft_2d(NY, NX, zin, zout, FFTW_FORWARD, FFTW_ESTIMATE)
    IF ( .NOT. C_ASSOCIATED(planf) ) THEN
       PRINT*, "plan creation error!!"
       STOP
    END IF
    !
    zin = CMPLX( RESHAPE( AA, (/ NX*NY /) ) , KIND=dp )
    !
    ! FFT Execution (forward)
    CALL FFTW_EXECUTE_DFT(planf, zin, zout)
    !
    DO iy = 1, Ny
       DO ix = 1, Nx
          ii = ix + nx*(iy-1)
          ATF(ix,iy) = zout(ii)
       END DO
    END DO
    !
    ! Plan Destruction
    CALL FFTW_DESTROY_PLAN(planf)
    CALL FFTW_CLEANUP
    !
    DEALLOCATE( zin, zout )
    !
  END SUBROUTINE fft
  !
  !
  SUBROUTINE ifft(NX,NY,ATF,AA)
    !
    ! Calculate the inverse Discrete Fourier transform ATF of the matriz AA
    ! Both matrices have the same size, but AA ATF (the input) is complex
    ! and AA (the output) is real
    !
    USE, INTRINSIC :: iso_c_binding
    IMPLICIT none
    INCLUDE 'fftw3.f03'
    !   include 'aslfftw3.f03'
    !
    INTEGER(C_INT),            INTENT(in)  :: Nx, Ny
    COMPLEX(C_DOUBLE_COMPLEX), ALLOCATABLE :: zin(:), zout(:)
    TYPE(C_PTR)                            :: planb
    !
    INTEGER                                :: ii, yy, ix, iy
    REAL(KIND=dp)                          :: AA(:,:)
    COMPLEX(KIND=dp)                       :: ATF(:,:)
    !
    ALLOCATE( zin(NX * NY), zout(NX * NY) )
    !
    ! Plan Creation (out-of-place forward and backward FFT)
   planb = fftw_plan_dft_2d(NY, NX, zin, zout, FFTW_BACKWARD, FFTW_ESTIMATE)
    IF ( .NOT. C_ASSOCIATED(planb) ) THEN
       PRINT*, "plan creation error!!"
       STOP
    END IF
    !
    zin = RESHAPE( ATF, (/ NX*NY /) )
    !
    ! FFT Execution (backward)
!    CALL FFTW_EXECUTE_DFT(planb, zout, zin)
    CALL FFTW_EXECUTE_DFT(planb, zin, zout)
    !
    DO iy = 1, Ny
       DO ix = 1, Nx
          ii = ix + nx*(iy-1)
          AA(ix,iy) = dble( zout(ii) )
       END DO
    END DO
    !
    ! Plan Destruction
    CALL FFTW_DESTROY_PLAN(planb)
    CALL FFTW_CLEANUP
    !
    DEALLOCATE( zin, zout )
    !
  END SUBROUTINE ifft
  !
END MODULE fftw_module

MODULE decimal
  !
  ! Precision in the whole program
  !
  ! dp : double precision
  ! sp : simple precision
  !
  IMPLICIT NONE
  !
  INTEGER, PARAMETER ::  dp = KIND(1.D0)
  INTEGER, PARAMETER ::  sp = KIND(1.0)
  !
END MODULE decimal

A matrix example (that works because is small), corresponding on the ex2.dat input file:

5 5
1    2     3    4    5.3
6    7     8    9    10
11   12    3    5    3
0.1 -0.32  0.4  70   12
0 1  0    -1    0    -70

Solution

When you perform a forward and then a back discrete Fourier Transform on some data the normalisation of the result is conventional, usually you either get the data back as it was (to floating point accuracy), or if you are using an unnormalised transform the data will be scaled by the number of points in the data set, or you provide the normalisation as an argument. To find out which you will have read the documentation of whatever software you are using to do the transforms; fftw uses unnormalised transforms . Thus in your code you will need to preform the appropriate scaling. And if you run your code on your datasets you find the scaling is as described - on a 10x10 dataset the final data is 100 times the original data.

I cannot reproduce your claim that the code as given works for the smaller data sets. I get the expected scaling.