I was writing some code when I noticed that one line takes a huge amount of time. Here's a simplified version (the line is marked with !*):
program main
  implicit none
  real*8, allocatable :: x(:), y(:), f(:)
  real*8 :: one, two, six, alpha, sigma, eps, m, n, r2, r, ff, start, finish, rr
  integer*8 :: q, i, j
  q = 10000
  one = 1.
  two = 2.
  six = 6.
  alpha = 4.
  n = 12.
  m = 6.
  eps = 5.
  sigma = 1.
  rr = 2.1234567654324556
  allocate(x(q), y(q), f(q))
  call RANDOM_NUMBER(x)
  call RANDOM_NUMBER(y)
  f(:) = 0.
  call CPU_TIME(start)
  do i=1,q
    do j=i+1,q
      r2 = (x(i)-x(j))**two+(y(i)-y(j))**two
      ff = six*alpha*eps*(one/r2*(sigma**m/(r2**(m/two))-two*sigma**n/(r2**(n/two))))
      r = -(x(i)-x(j))*ff
      f(i) = f(i) + r !*
    end do
  end do
  call CPU_TIME(finish)
  print*, finish-start
end program main
The time needed to run this code is approximately 10 seconds, but if you replace r with rr in the line marked with !*, the time drops to about 0.01 seconds.
Can anyone explain this? What is the difference between r and rr, given that they are both real*8?
I am using Windows 8.1, Visual Studio 12 Ultimate, Intel Composer XE 2013 and the -O2 flag.
Converting the comments into an answer...
If you use rr instead of r in the marked line, the values of r2, ff and r are never used, so all the computations in that loop become irrelevant and the compiler can optimize them away. My guess is that this results in the "performance increase" you see.
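One way to check this guess is to make sure the loop's result is actually used after the timing, for example by printing a checksum. The following is only a minimal sketch: the program name, the dp kind parameter, the smaller q and the exp() stand-in for the expensive expression are my own assumptions, not part of the original code, and whether a given compiler really eliminates the dead code depends on its version and flags.

```fortran
program dead_code_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)   ! assumed double-precision kind
  integer, parameter :: q = 2000           ! smaller than the question's 10000
  real(dp), allocatable :: x(:), f(:)
  real(dp) :: r, rr, start, finish
  integer :: i, j

  rr = 2.1234567654324556_dp
  allocate(x(q), f(q))
  call random_number(x)
  f = 0.0_dp

  call cpu_time(start)
  do i = 1, q
    do j = i + 1, q
      ! Stand-in for the expensive expression in the question.
      r = exp(-(x(i) - x(j))**2)
      ! Swap r for rr here and nothing computed above is needed any more,
      ! so an optimizer is free to drop the work that produces r.
      f(i) = f(i) + r
    end do
  end do
  call cpu_time(finish)

  print *, 'time:', finish - start
  ! Using f after the loop keeps the accumulation observable.
  print *, 'checksum:', sum(f)
end program dead_code_demo
```

Comparing the timings of the r and rr variants with the checksum printed shows how much of the difference comes from work the compiler was allowed to skip.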
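Also, most of the calculations you perform in the loop do not depend on x and y, so you can easily pre-compute them outside the loop. Please also note that (depending on the intelligence of your compiler) x**2 is faster than x**2.0, because an integer exponent can be expanded into multiplications, while a real exponent may require a general power routine.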
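As a rough sketch of both suggestions, the loop could be rewritten as follows. This assumes m and n stay fixed at 6 and 12 as in the question, so that r2**(m/two) is r2**3 and r2**(n/two) is r2**6. The names c, s6, s12, dx, dy, inv_r2, inv_r6 and dp are mine, and q is reduced to keep the run short.

```fortran
program precompute_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)   ! assumed double-precision kind
  integer, parameter :: q = 2000           ! smaller than the question's 10000
  real(dp), parameter :: alpha = 4.0_dp, eps = 5.0_dp, sigma = 1.0_dp
  ! Loop-invariant factors, evaluated once with integer exponents.
  real(dp), parameter :: c = 6.0_dp*alpha*eps
  real(dp), parameter :: s6 = sigma**6, s12 = sigma**12
  real(dp), allocatable :: x(:), y(:), f(:)
  real(dp) :: dx, dy, inv_r2, inv_r6, ff, start, finish
  integer :: i, j

  allocate(x(q), y(q), f(q))
  call random_number(x)
  call random_number(y)
  f = 0.0_dp

  call cpu_time(start)
  do i = 1, q
    do j = i + 1, q
      dx = x(i) - x(j)
      dy = y(i) - y(j)
      inv_r2 = 1.0_dp/(dx*dx + dy*dy)      ! 1/r2, squares done by multiplication
      inv_r6 = inv_r2*inv_r2*inv_r2        ! 1/r2**3, replaces r2**(m/two)
      ! Same expression as in the question, with 1/r2**3 factored out:
      ! c/r2 * (s6/r2**3 - 2*s12/r2**6)
      ff = c*inv_r2*inv_r6*(s6 - 2.0_dp*s12*inv_r6)
      f(i) = f(i) - dx*ff
    end do
  end do
  call cpu_time(finish)

  print *, 'time:', finish - start
  print *, 'checksum:', sum(f)             ! keeps the result in use
end program precompute_demo
```

The inner loop now contains one division and a handful of multiplications instead of several real-valued power operations. How much this helps in practice depends on the compiler and its optimization level.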