I have the code below for parallelizing matrix-vector multiplication, but whenever I run it, it executes on just one thread (even though I specified 4). How can I get the parallel loop to run on separate threads? Any help would be highly appreciated. Thanks.
int nthreads;
nthreads = 4;
omp_set_num_threads(nthreads);
chunk = m/nthreads;
#pragma omp parallel for private(i,j,H) schedule(static,chunk)
for (i=0; i<m; i++) {
    C[i] = 0;
    for (j=0; j<p; j++) {
        int H = omp_get_thread_num();
        C[i] += (A[i+(j*m)]*B[j]);
    }
}
Did you include this snippet inside a #pragma omp parallel { ... } region? Otherwise, you might be missing the word parallel in the directive.
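To make the suggestion above concrete, here is a minimal sketch of the same loop wrapped in an explicit parallel region. The function name matvec_omp and its return value (the team size) are hypothetical, introduced only for illustration; the column-major indexing A[i + j*m] is taken from the question. The omp.h include is guarded by _OPENMP so the code still builds, and runs on one thread, when OpenMP is not enabled at compile time.

```c
#include <assert.h>
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: computes C = A * B, where A is an m x p matrix stored
   column-major (element (i,j) at A[i + j*m], as in the question).
   Returns the number of threads in the team that executed the loop. */
int matvec_omp(int m, int p, const double *A, const double *B, double *C,
               int nthreads)
{
    int team_size = 1;  /* stays 1 if OpenMP is not enabled */

    assert(m >= 0 && p >= 0 && nthreads > 0);

#ifdef _OPENMP
    omp_set_num_threads(nthreads);
#endif

    #pragma omp parallel
    {
#ifdef _OPENMP
        /* One thread records how many threads the runtime actually gave us. */
        #pragma omp single
        team_size = omp_get_num_threads();
#endif

        /* Rows are split among the threads; i and j are private because
           they are declared inside the parallel region. */
        #pragma omp for schedule(static)
        for (int i = 0; i < m; i++) {
            double sum = 0.0;
            for (int j = 0; j < p; j++)
                sum += A[i + j * m] * B[j];
            C[i] = sum;
        }
    }
    return team_size;
}
```

Calling matvec_omp(m, p, A, B, C, 4) and printing the returned value shows how many threads were really used. If it reports 1, one possible cause (an assumption, since the question does not show the build command) is that the program was compiled without the compiler's OpenMP flag, for example -fopenmp with GCC or Clang, in which case the pragmas are ignored.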