I have already seen several posts on this site that discuss this issue. However, my serious codes, where the overhead of creating threads should not be a big issue, have become much slower with OpenMP. I am using a quad-core machine with gfortran 4.6.3 as my compiler. Below is an example test code.
Program test
use omp_lib
integer*8 i,j,k,l
!$omp parallel
!$omp do
do i = 1,20000
do j = 1, 1000
do k = 1, 1000
l = i
enddo
enddo
enddo
!$omp end do nowait
!$omp end parallel
End program test
This code takes around 80 seconds when I run it without OpenMP, but around 150 seconds with OpenMP. I have seen the same issue with my other serious codes, whose runtime is around 5 minutes in serial mode. In those codes I take care that there are no dependencies between threads. Why do these codes become slower instead of faster?
Thanks in advance.
You have a race condition: multiple threads write to the same shared variable l. The program is therefore invalid; l should be private. The race also causes a slowdown, because each write invalidates the copy of that cache line held by the other cores, so the threads have to reload the memory content all the time. A similar thing happens when several threads use different variables that sit in the same cache line; that is known as false sharing.
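A minimal sketch of the fix described above, assuming the loop body stays exactly as in the question. The program name test_private and the combined parallel do directive are choices made here for illustration, not part of the original code.

```fortran
program test_private
  use omp_lib
  implicit none
  integer(kind=8) :: i, j, k, l

  ! private(l): every thread works on its own copy of l, so the
  ! threads no longer write to one shared memory location.
  ! i is private automatically as the index of the parallel loop;
  ! in Fortran the indices j and k of the sequential inner loops
  ! are private by default as well.
  !$omp parallel do private(l)
  do i = 1, 20000
    do j = 1, 1000
      do k = 1, 1000
        l = i
      end do
    end do
  end do
  !$omp end parallel do
end program test_private
```

With gfortran this has to be compiled with -fopenmp, otherwise the !$omp lines are treated as ordinary comments and the program runs serially.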
You are also probably not using any compiler optimizations. Enable them with -O2, -O3, -O5 or -Ofast. You will then see that the program takes 0 seconds, because the compiler optimizes everything out.
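Because the optimizer removes loops whose result is never used, a test like this only measures something if the work feeds into a value that is printed. The following is a rough sketch under that assumption; the loop body, the variable total, and the program name test_timing are hypothetical stand-ins, not the code from the question.

```fortran
program test_timing
  use omp_lib
  implicit none
  integer(kind=8) :: i, j, total
  double precision :: t0, t1

  total = 0
  t0 = omp_get_wtime()   ! wall-clock time before the parallel loop

  ! reduction(+:total): each thread accumulates into a private copy
  ! of total, and the copies are added together at the end of the
  ! loop, so there is no race on total.
  !$omp parallel do private(j) reduction(+:total)
  do i = 1, 20000
    do j = 1, 1000
      total = total + mod(i * j, 7_8)   ! arbitrary stand-in work
    end do
  end do
  !$omp end parallel do

  t1 = omp_get_wtime()   ! wall-clock time after the parallel loop

  ! Printing total means the loops cannot be discarded as dead code.
  print *, 'total =', total
  print *, 'wall time (s) =', t1 - t0
end program test_timing
```

Building it once with and once without -fopenmp, using the same optimization level both times, gives a fairer serial versus parallel comparison than the empty loop nest.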