Search code examples
c++multithreadingparallel-processingopenmpnested-loops

Openmp c++: error: collapsed loops not perfectly nested


I have the following serial code that I would like to make parallel. I understand when using the collapse clause for nested loops, it's important to not have code before and after the for(i) loop since is not allowed. Then how do I parallel a nested for loop with if statements like this:

void foo2D(double Lx, double Ly, double KX[], double KY[], double ksqu[], double ninvksqu[], int FourierMeshType){
    
// Make a variable to account for way wavenumbers are set in FFTW3. 
    
int k_counter = 0;      
    
// kx corresponds to modes [0:n/2-1 , -n/2:-1]. This is why there is an additional step, just due to FFTW3's structuring
    #pragma omp parallel for collapse(2) 
    for(int i = 0; i < nx ; i++){
        for(int j = 0; j < nyk; j++){           
            if (i < nx/2){ // i<nx/2 --> 0 to 127
                KX[j + nyk*i] =2.*M_PI*i/Lx; //K = 2pi/L* [0:nx/2-1 , -n/2:-1]' : creates a column vector with nx/2 elements starting from 0 ( from 0 to 127 then -128 to -1) 256/2 is 128      
            }
            if( i >= nx/2){ // i >= nx/2 --> from -128 to -1
                KX[j + nyk*i] =  2.* M_PI * (-i + 2.*k_counter) / Lx;           
            }
        }
        if( i >= nx/2){ // error here 
            k_counter++;
        }
    }
}

Where nx, ny, nyk defined as:

static const int nx = 256;
static const int ny = 256;
static const int nyk = ny/2 + 1;

Is there a better way to rewrite this?


Solution

  • As pointed out in the comments by 1201ProgramAlarm, you can get rid of the error by eliminating the if branch that exists between the two loops:

    #pragma omp parallel for collapse(2) 
    for(int i = 0; i < nx ; i++){
        for(int j = 0; j < nyk; j++){    
           ...       
        }
        if( i >= nx/2){ <== remove this
            k_counter++;
        }
    }
    

    This :

    if( i >= nx/2){ <== remove this
        k_counter++;
    }
    

    can be statically calculated beforehand. So your code would look like the following:

    void makeFourierMesh2D(double Lx, double Ly, double KX[], double KY[], double ksqu[], double ninvksqu[], int FourierMeshType){
        
        #pragma omp parallel for collapse(2) 
        for(int i = 0; i < nx ; i++){
            for(int j = 0; j < nyk; j++){           
                if (i < nx/2){
                    KX[j + nyk*i] =2.*M_PI*i/Lx; 
                }
                if( i >= nx/2){
                    int k_counter = i - nx/2;
                    KX[j + nyk*i] =  2.* M_PI * (-i + 2.*k_counter) / Lx;           
                }
            }
        }
    }
    

    You can then simplify (-i + 2.*k_counter) to (-i + 2 * ( i - nx/2)) and then (-i + 2*i - nx), which is i - nx.

    To further remove the other loops you can just compute each branch in a separate pair of loops as follows:

        #pragma omp parallel for collapse(2) 
        for(int i = 0; i < nx/2 ; i++){
            for(int j = 0; j < nyk; j++){           
                 KX[j + nyk*i] =2.*M_PI*i/Lx; 
            }
        }
    
        #pragma omp parallel for collapse(2) 
        for(int i = nx/2; i < nx ; i++){
            for(int j = 0; j < nyk; j++){           
                 KX[j + nyk*i] =  2.* M_PI * ((i - nx) / Lx);    
            }
        }
    

    as previously you can also simplify (-i + 2.*k_counter) into i - nx.