When passing arrays to procedures, which is better in terms of (1) speed and (2) memory: assumed-shape or explicit-shape dummy arguments? A similar question was asked some time ago in this forum, but not in these terms: Passing size as argument VS assuming shape in Fortran procedures
I provide a simple program to show what I mean:
! Compile with
! ifort /O3 main.f90 -o run_win.exe
module mymod
   use iso_fortran_env, only: dp => real64
   implicit none
   private
   public :: dp, sub_trace, sub_trace_es
contains
   subroutine sub_trace(mat,trace)
      ! Assumed shape
      implicit none
      real(dp), intent(in) :: mat(:,:)
      real(dp), intent(out) :: trace
      real(dp) :: V(size(mat,dim=1))
      integer :: i,N
      if (size(mat,dim=1) /= size(mat,dim=2)) then
         error stop "Input matrix is not square!"
      endif
      N = size(mat,dim=1)
      do i=1,N
         V(i) = mat(i,i)
      enddo
      trace = sum(V)
   end subroutine sub_trace
   subroutine sub_trace_es(n,mat,trace)
      ! Passing array explicit shape
      implicit none
      integer, intent(in) :: n
      real(dp), intent(in) :: mat(n,n)
      real(dp), intent(out) :: trace
      real(dp) :: V(n)
      integer :: i
      do i=1,n
         V(i) = mat(i,i)
      enddo
      trace = sum(V)
   end subroutine sub_trace_es
end module mymod
program main
   use mymod, only: dp, sub_trace,sub_trace_es
   implicit none
   integer, parameter :: nn=2
   real(dp) :: mat(nn,nn)
   real(dp), allocatable :: mat4(:,:)
   real(dp) :: trace1,trace2,trace3,trace4
   write(*,*) "Passing arrays to subroutines:"
   write(*,*) "Assumed-shape vs explicit shape"
   ! Real literals: 2_dp would be an integer literal of kind dp, not a real
   mat(1,:) = [2.0_dp,3.0_dp]
   mat(2,:) = [4.0_dp,5.0_dp]
   call sub_trace(mat,trace1)
   write(*,*) "trace1 = ", trace1
   call sub_trace_es(nn,mat,trace2)
   write(*,*) "trace2 = ", trace2
   ! First example offered by francescalus:
   call sub_trace_es(2,real([1,2,3,4,5,6,7,8,9],dp), trace3)
   write(*,*) "trace3 = ", trace3
   ! Second example
   mat4 = reshape(real([1,2,3,4,5,6,7,8,9],dp),[3,3])
   call sub_trace(mat4, trace4)
   write(*,*) "trace4 = ", trace4
   ! Wait for Enter (pause is a deleted feature of the standard)
   read(*,*)
end program
With assumed shape you can pass non-contiguous arrays or their sections without temporary copies. The receiving subroutine knows where the individual parts are in memory and can jump between them thanks to the strides stored in the array descriptor (dope vector). You avoid a temporary copy, but iterating through the array is more complicated and may be slower.
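As a minimal sketch of the point above, the following program passes a whole array and then a strided section to an assumed-shape dummy. The names `section_demo`, `report` and `big` are made up for this illustration. `is_contiguous` is a Fortran 2008 intrinsic, so a reasonably recent compiler is assumed.

```fortran
module section_demo
   implicit none
contains
   subroutine report(a)
      ! Assumed-shape dummy: the extents and strides arrive with the descriptor
      real, intent(in) :: a(:,:)
      write(*,*) "shape      =", shape(a)
      write(*,*) "contiguous =", is_contiguous(a)
      write(*,*) "sum        =", sum(a)
   end subroutine report
end module section_demo

program demo_section
   use section_demo, only: report
   implicit none
   real :: big(4,4)
   integer :: i, j
   do j = 1, 4
      do i = 1, 4
         big(i,j) = real(10*i + j)
      end do
   end do
   call report(big)                 ! whole array: contiguous
   call report(big(1:4:2, 1:4:2))   ! strided 2x2 section: not contiguous
end program demo_section
```

For the second call a compiler would normally hand over just a descriptor for the section, so `is_contiguous` is expected to report false there and no copy of the data is needed.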
If an assumed-shape array has the contiguous attribute, the compiler can generate simpler and faster code, but if the actual argument is not contiguous, a temporary copy must be made.
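A small sketch of the contiguous variant, with hypothetical names (`contig_demo`, `total`). The second call passes a strided section, so the compiler has to arrange a contiguous temporary before entering the function. The code is still valid, it just pays for the copy.

```fortran
module contig_demo
   implicit none
contains
   function total(a) result(s)
      ! contiguous promises unit stride, so the loop can use simple addressing
      real, intent(in), contiguous :: a(:)
      real :: s
      integer :: i
      s = 0.0
      do i = 1, size(a)
         s = s + a(i)
      end do
   end function total
end module contig_demo

program demo_contiguous
   use contig_demo, only: total
   implicit none
   real :: x(10)
   integer :: i
   x = [(real(i), i = 1, 10)]
   write(*,*) total(x)           ! contiguous actual argument, no copy needed
   write(*,*) total(x(1:10:2))   ! strided section: copied to a temporary first
end program demo_contiguous
```

The two sums are 55 (all ten elements) and 25 (elements 1, 3, 5, 7, 9).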
For explicit-shape arrays, the dummy argument is always contiguous. However, a temporary copy will be necessary if the actual argument is not contiguous.
With assumed-shape arrays you get the benefit of better argument checking by the compiler at compile time, because an explicit interface is required for them and is therefore always available. Some checking is possible even for explicit-shape arrays if an explicit interface is available, and sometimes even when it is not, but the possibilities are more limited.
One reason for that is that, due to the storage association rules, it is possible to pass an array with a different rank and with a total size (number of elements) larger than or equal to the size declared in the shape of the explicit-shape dummy argument.
For an assumed-shape array, the shape is passed automatically with the array descriptor. Passing an array smaller or larger than a declared size is therefore a concept that does not exist for them; they simply work differently.
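To illustrate, here is a short sketch (the names `shape_demo`, `describe_as` and `describe_es` are invented for this example) in which the same array is passed both ways. The assumed-shape routine reads the extents from the descriptor, while the explicit-shape routine has to be told them by the caller.

```fortran
module shape_demo
   implicit none
contains
   subroutine describe_as(a)
      ! Assumed shape: extents come with the descriptor, lower bounds default to 1
      real, intent(in) :: a(:,:)
      write(*,*) "assumed shape  extents:", shape(a), " lbounds:", lbound(a)
   end subroutine describe_as

   subroutine describe_es(m, n, a)
      ! Explicit shape: the caller must supply the extents itself
      integer, intent(in) :: m, n
      real, intent(in) :: a(m,n)
      write(*,*) "explicit shape extents:", shape(a), " lbounds:", lbound(a)
   end subroutine describe_es
end module shape_demo

program demo_shape
   use shape_demo, only: describe_as, describe_es
   implicit none
   real :: a(0:2, 5:8)   ! 3x4 array with non-default lower bounds
   a = 0.0
   call describe_as(a)
   call describe_es(size(a,1), size(a,2), a)
end program demo_shape
```

Both routines see a 3x4 array with lower bounds of 1. The difference is only in who is responsible for getting the extents right.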
In many ways these two kinds of array dummy arguments are just too different, and they differ in what you can do with them. It is not a matter of choosing one or the other purely because of speed: they differ strongly in the way they are used. For explicit-shape arrays you have to provide the size somehow.
These differences can be illustrated by the examples offered by francescalus:
call sub_trace_es(2, real([1,2,3,4,5,6,7,8,9], dp), trace)
This asks for the trace of a 2x2 array. The argument being passed is a 1D array containing 9 elements. However, only the first four will be considered. The matrix the subroutine will look at is
1 3
2 4
(column-major order) and the trace will be 5.
For
call sub_trace( reshape(real([1,2,3,4,5,6,7,8,9], dp), [3,3]), trace)
the same 9-element numeric sequence is reshaped into a 3x3 array. Hence the matrix the subroutine will look at is
1 4 7
2 5 8
3 6 9
and the trace will be 15.
I personally use assumed shape with the explicit contiguous attribute in several places of my production code for supercomputers where large arrays are passed around. However, be careful to enable warnings about temporary copies: it is easy to forget this in one location, and then you spoil everything with unnecessary temporaries.
In most parts of my code that are not so performance-critical, I just use assumed shape without further attributes.
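As a sketch of how the warnings about temporary copies can be put to use: the program below passes a matrix row (stride 4, not contiguous) first to an explicit-shape dummy and then to an assumed-shape dummy. The routine names are hypothetical. To my knowledge gfortran offers `-Warray-temporaries` (compile-time warning) and ifort offers `-check arg_temp_created` (run-time message) for this, but check your compiler's documentation for the exact spelling.

```fortran
! Possible compile lines (flags as I understand them, verify for your compiler):
!   gfortran -Warray-temporaries temp_demo.f90
!   ifort -check arg_temp_created temp_demo.f90
module temp_demo
   implicit none
contains
   subroutine scale_es(n, v, factor)
      ! Explicit shape: v must be contiguous inside the routine
      integer, intent(in) :: n
      real, intent(inout) :: v(n)
      real, intent(in) :: factor
      v = factor * v
   end subroutine scale_es

   subroutine scale_as(v, factor)
      ! Assumed shape: the stride comes with the descriptor
      real, intent(inout) :: v(:)
      real, intent(in) :: factor
      v = factor * v
   end subroutine scale_as
end module temp_demo

program demo_temporaries
   use temp_demo, only: scale_es, scale_as
   implicit none
   real :: a(4,4)
   a = 1.0
   call scale_es(4, a(2,:), 2.0)   ! row section: a temporary copy-in/copy-out is likely
   call scale_as(a(3,:), 3.0)      ! row section: no temporary needed
   write(*,*) a(2,:)
   write(*,*) a(3,:)
end program demo_temporaries
```

Both calls give the correct result (row 2 scaled by 2, row 3 scaled by 3). Only the first one should trigger the temporary-copy diagnostic.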