Trapezoidal rule integration using OpenMP and private clauses

I am converting a serial code to run in parallel with OpenMP, but the parallel version gives a poor approximation of the expected result (the value of pi). Both codes are shown below, the parallel one first.

Is there something wrong?

program trap
use omp_lib 
implicit none
double precision::suma=0.d0 ! sum is a scalar
double precision:: h,x,lima,limb
integer::n,i, istart, iend, thread_num, total_threads=4, ppt
integer(kind=8):: tic, toc, rate
double precision:: time
double precision, dimension(4):: pi= 0.d0

call system_clock(count_rate = rate)
call system_clock(tic)

lima=0.0d0; limb=1.0d0; suma=0.0d0; n=10000000
h=(limb-lima)/n

suma=h*(f(lima)+f(limb))*0.5d0 !first and last points

ppt= n/total_threads
!$ call omp_set_num_threads(total_threads)

!$omp parallel private (istart, iend, thread_num, i)
  thread_num = omp_get_thread_num()
  !$ istart = thread_num*ppt +1
  !$ iend = min(thread_num*ppt + ppt, n-1)
do i=istart,iend ! this will control the loop in different images
  x=lima+i*h
  suma=suma+f(x) 
  pi(thread_num+1)=suma
enddo
!$omp end parallel

suma=sum(pi) 
suma=suma*h

print *,"The value of pi is= ",suma ! print once from the first image
!print*, 'pi=' , pi
call system_clock(toc)
time = real(toc-tic)/real(rate)
print*, 'Time ', time, 's'

contains

double precision function f(y)
double precision:: y
f=4.0d0/(1.0d0+y*y)
end function f

end program trap

!----------------------------------------------------------------------------------
program trap
implicit none
double precision::sum ! sum is a scalar
double precision:: h,x,lima,limb
integer::n,i
integer(kind=8):: tic, toc, rate
double precision:: time

call system_clock(count_rate = rate)
call system_clock(tic)

lima=0.0d0; limb=1.0d0; sum=0.0d0; n=10000000
h=(limb-lima)/n

sum=h*(f(lima)+f(limb))*0.5d0 !first and last points

do i=1,n-1 ! this will control the loop in different images
  x=lima+i*h
  sum=sum+f(x)
enddo

sum=sum*h

print *,"The value of pi is (serial exe)= ",sum ! print once from the first image

call system_clock(toc)
time = real(toc-tic)/real(rate)
print*, 'Time serial execution', time, 's'

contains

double precision function f(y)
double precision:: y
f=4.0d0/(1.0d0+y*y)
end function f

end program trap

Compiled using:

$ gfortran -fopenmp -Wall -Wextra -O2 -Wall -o prog.exe test.f90 
$ ./prog.exe

and

$ gfortran -Wall -Wextra -O2 -Wall -o prog.exe testserial.f90 
$ ./prog.exe

With serial execution I get a good approximation of pi (3.1415), but the parallel version gives a different result on every run (several parallel runs shown):

 The value of pi is=    3.6731101425922810     

 Time    3.3386986702680588E-002 s

-------------------------------------------------------

 The value of pi is=    3.1556004791445953     

 Time    8.3681479096412659E-002 s

------------------------------------------------------

 The value of pi is=    3.2505952856717966     

 Time    5.1473543047904968E-002 s

Solution

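The problem is in your OpenMP parallel region. All threads add to the shared variable suma at the same time, which is a data race: updates can be lost, and the value each thread copies into pi(thread_num+1) includes contributions from the other threads, which is why the result changes from run to run. Declaring suma in a reduction clause gives each thread its own copy and combines the copies when the loop ends. The variable x is also shared by default, so it must be declared private; otherwise one thread can overwrite x between another thread's assignment and its call to f(x).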

I also changed a few other parts of your code:

You explicitly told each thread which index range to use. The OpenMP runtime can usually distribute the iterations itself, so I replaced parallel with parallel do and let it split the loop.
It is good practice to specify default(none) on the parallel region. You then have to declare the data-sharing attribute of every variable used in the region explicitly, and the compiler rejects any variable you forgot, such as the shared x.

program trap
  use omp_lib
  implicit none
  double precision   :: suma,h,x,lima,limb, time
  integer            :: n, i
  integer, parameter :: total_threads=5
  integer(kind=8)    :: tic, toc, rate

  call system_clock(count_rate = rate)
  call system_clock(tic)

  lima=0.0d0; limb=1.0d0; suma=0.0d0; n=10000000
  h=(limb-lima)/n

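  ! End points carry weight 1/2; the common factor h is applied once, after the loop
  suma=(f(lima)+f(limb))*0.5d0

  call omp_set_num_threads(total_threads)
  !$omp parallel do default(none) private(i, x) shared(lima, h, n) reduction(+: suma)
  do i = 1, n-1 ! interior points only; the end points are already in suma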
    x=lima+i*h
    suma=suma+f(x)
  end do
  !$omp end parallel do

  suma=suma*h

  print *,"The value of pi is= ", suma
  call system_clock(toc)
  time = real(toc-tic)/real(rate)
  print*, 'Time ', time, 's'

contains

  double precision function f(y)
    double precision:: y
    f=4.0d0/(1.0d0+y*y)
  end function

end program
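
If you would rather keep the manual partitioning from your question, the following is a minimal sketch of how it could be made race-free; it is essentially what the reduction clause does for you. The names trap_manual, partial, local_sum, tid and nthreads are my own choices for illustration, and the sketch assumes it is built with gfortran -fopenmp as in your compile command. Each thread accumulates into a private local_sum and writes its own slot of partial exactly once, so no two threads ever update the same variable.

```fortran
program trap_manual
  use omp_lib
  implicit none
  double precision   :: suma, h, x, lima, limb, local_sum
  integer            :: n, i, istart, iend, tid, nthreads, ppt
  integer, parameter :: total_threads = 4
  double precision   :: partial(0:total_threads-1)  ! one slot per thread

  lima = 0.0d0; limb = 1.0d0; n = 10000000
  h = (limb - lima)/n
  partial = 0.0d0

  call omp_set_num_threads(total_threads)
  !$omp parallel default(none) shared(partial, lima, h, n) &
  !$omp   private(i, x, istart, iend, tid, nthreads, ppt, local_sum)
  tid      = omp_get_thread_num()
  nthreads = omp_get_num_threads()            ! team size actually granted
  ppt      = (n - 1 + nthreads - 1)/nthreads  ! interior points per thread, rounded up
  istart   = tid*ppt + 1
  iend     = min(istart + ppt - 1, n - 1)

  local_sum = 0.0d0                           ! private accumulator, so no race
  do i = istart, iend
    x = lima + i*h
    local_sum = local_sum + f(x)
  end do
  partial(tid) = local_sum                    ! each thread writes only its own slot
  !$omp end parallel

  ! End points carry weight 1/2; h is applied once to the whole sum
  suma = h*(0.5d0*(f(lima) + f(limb)) + sum(partial))

  print *, "The value of pi is= ", suma
  ! Sanity check against 4*atan(1); the tolerance is a loose, assumed bound
  if (abs(suma - 4.0d0*atan(1.0d0)) > 1.0d-9) error stop "unexpected result"

contains

  double precision function f(y)
    double precision, intent(in) :: y
    f = 4.0d0/(1.0d0 + y*y)
  end function f

end program trap_manual
```

The chunk size is computed from omp_get_num_threads() inside the region rather than from total_threads, because the runtime may grant fewer threads than requested; unused slots of partial simply stay at zero. For a plain sum like this one, the parallel do with reduction shown above is shorter and leaves the scheduling to the runtime.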