Suppose I've a scrambled vector of consecutive integers 1:n, say {3,6,2,1,4,5}. My problem is to find, for each element, the number of elements to its left that are smaller than itself. So I'd like the program to return {0,1,0,0,3,4} for this example. This is what I've written in Fortran:
subroutine iterrank(n,invec,outvec,tempvec)
implicit none
integer :: n, i, currank
integer, dimension(n) :: invec, outvec, tempvec
tempvec = 0
outvec = 0
do i = 1,n
currank = invec(i)
outvec(i) = tempvec(currank)
tempvec(currank:n) = tempvec(currank:n) + 1
end do
return
end subroutine
It takes a temporary array (vector), and for each digit d the loop comes across, it adds 1 to every element beyond position d in the temporary vector. The next iteration then takes the appropriate element in the temporary vector as the count of elements smaller than itself. My questions are:
1) I believe this is of complexity O(n^2), since there are O(n) writes to the temporary vector in each iteration of the loop. Am I correct?
2) Is there a more efficient way of doing this for large n (say, >100k)?
I believe this would be more efficient, and you could also reduce the temporary integer array to a single byte.
subroutine iterrank(n,invec,outvec,tempvec)
implicit none
integer :: n, i, currank
integer, dimension(n) :: invec, outvec, tempvec
tempvec = 0
!outvec = 0 ! no need to initialize something overwritten below
do i = 1 , n
currank = invec(i)
outvec(i) = sum( tempvec(1:currank) )
tempvec(currank) = 1
end do
end subroutine
The gain is that you are only writing twice per index, however you are reading elements a maximum of n*n
times.
EDIT:
I haven't tried this, but it should do less reads, with a possible overhead of branching. It is possibly faster for extremely large arrays, I would however expect it to be slower for short arrays:
subroutine iterrank(n,invec,outvec,tempvec)
implicit none
integer :: n, i, currank, prevrank
integer, dimension(n) :: invec, outvec, tempvec
tempvec = 0
outvec(1) = 0
tempvec(invec(1)) = 1
do i = 2 , n
prevrank = invec(i-1)
currank = invec(i)
if ( abs(prevrank-currank) > currank ) then
outvec(i) = sum( tempvec(1:currank) )
else if ( prevrank < currank ) then
outvec(i) = outvec(i-1) + sum( tempvec(prevrank:currank) )
else
outvec(i) = outvec(i-1) - sum( tempvec(currank:prevrank-1) )
end if
tempvec(currank) = 1
end do
end subroutine iterrank