c openmp hpc

OpenMP parallel for slows down my code (C language)


I'm trying to use OpenMP to speed up a parallel version of list ranking. My implementation is as follows:

int ListRankingParallel(int *R1,int *S, int N)
{
int i;
int *Q = (int*)malloc(N * sizeof(int));

#pragma omp parallel for private(i)
for (i=0; i<N; i++){

    if( S[i] != -1)R1[i] = 1;
    else R1[i] = 0;
    Q[i] = S[i];

}

#pragma omp parallel for private(i)
for(i=0; i<N; i++)
    while (Q[i] != -1 && Q[Q[i]] != -1) {
        R1[i] = R1[i] + R1[Q[i]];
        Q[i] = Q[Q[i]];
    }

free(Q);

return *R1;
}

The serial version of my list ranking is:

int ListRankingSerial(int *R2,int *S, int N)
{
int temp;  
int j,i;
for( i=0; i<N; i++){
    j = 0;
    temp = S[i];
    while(S[i]!=-1)
    {
        j++;
        S[i] = S[S[i]];
    }
    R2[i] = j;
    S[i] = temp;
}

return *R2;
}

I time each version separately, using:

get_walltime(&S1);
ListRankingParallel(R1,S,N);
get_walltime(&E1);

get_walltime(&S3);
ListRankingSerial(R3,S,N);
get_walltime(&E3);

On my Mac, the parallel version runs significantly faster than the serial version. On a Linux cluster, however, the parallel version is twice as slow as the serial version.

On my Mac, I compile my code using:

gcc-7 -fopenmp <file name>.c 

On the cluster, I compile it using:

gcc -fopenmp <file name>.c 

If you want to test my code, please use:

int main(){

int N = 1e+5;
int *S = (int*)malloc(N * sizeof(int));
int *R1 = (int*)malloc(N * sizeof(int));
int *R3 = (int*)malloc(N * sizeof(int));
double S1,S2,S3,E1,E2,E3;
int i;

for( i = 0; i < N; i++)
    S[i] = i+1;

S[N-1] = -1;

get_walltime(&S1);
ListRankingParallel(R1,S,N);
get_walltime(&E1);
printf("%f\n",E1-S1);

get_walltime(&S3);
ListRankingSerial(R3,S,N);
get_walltime(&E3);
printf("%f\n",E3-S3);

}
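
The definition of get_walltime is not shown here. For anyone trying to reproduce the timings, a minimal stand-in might look like the sketch below. It assumes the helper stores the current time in seconds through a double pointer, as the calls above suggest. The void return type and the use of POSIX clock_gettime with CLOCK_MONOTONIC are assumptions, not the original helper.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes clock_gettime under strict -std modes */
#include <time.h>

/* Hypothetical stand-in for the question's get_walltime(): stores a
   monotonic clock reading, in seconds, in *t. Only the difference
   between two readings is meaningful. */
void get_walltime(double *t)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    *t = (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}
```

With this helper placed above main, E1-S1 and E3-S3 are elapsed wall-clock seconds. If OpenMP is already linked in, omp_get_wtime() would serve the same purpose.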

Can anyone please give me some advice? Thank you!


Solution

  • Are you certain it is running on multiple threads?

    You should either be setting the OMP_NUM_THREADS environment variable or calling omp_set_num_threads() at the start of main. You can get the total number of threads available using omp_get_max_threads() and do something like

    int max_threads = omp_get_max_threads();
    omp_set_num_threads(max_threads);
    

    See more information about setting the number of threads in this answer.

    Edit: You can also check how many threads are in use by calling omp_get_num_threads() inside a parallel region; outside a parallel region it always returns 1.
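
To make the check above concrete, here is a small sketch that reports how many threads actually execute a parallel region. The function name count_team_threads is made up for illustration. The _OPENMP guard is only there so the file still builds, and reports 1, when -fopenmp is left off.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Returns the number of threads that execute a parallel region.
   Falls back to 1 when the file is compiled without -fopenmp. */
int count_team_threads(void)
{
    int team_size = 1;
#ifdef _OPENMP
    #pragma omp parallel
    {
        /* One thread records the team size; the implicit barrier at the
           end of the single construct makes the write visible to all. */
        #pragma omp single
        team_size = omp_get_num_threads();
    }
#endif
    return team_size;
}
```

Printing count_team_threads() at the start of main on both machines should show whether the cluster run really gets more than one thread. A result of 1 there would point to OMP_NUM_THREADS, or the job's core allocation, rather than to the loops themselves.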