Tags: memory-management, fortran, declaration, allocation

Difference between local allocatable and automatic arrays


I am interested in the difference between alloc_array and automatic_array in the following extract:

subroutine mysub(n)
integer, intent(in)  :: n
integer              :: automatic_array(n)
integer, allocatable :: alloc_array(:)

allocate(alloc_array(n))
...[code]...

I am familiar enough with the basics of allocation (not so much with advanced techniques) to know that allocation allows you to change the size of the array in the middle of the code (as pointed out in this question). Here, though, I'm interested in the case where you don't need to change the size of the array: the arrays might be passed on to other subroutines for operation, but the only purpose of both variables, in this code and in any subroutine, is to hold the data of an array of dimension n (and maybe change the data, but not the size).

(1) Is there any difference in memory usage? I am not an expert in low-level procedures, but I have a very slight knowledge of how they matter and how they can impact higher-level programming. (The kind of experience I'm talking about: once, trying to run a big code in Fortran, I was getting an error I didn't understand, and the sysadmin told me "oh, yeah, you are probably saturating the stack; try adding this line in your running script". Anything that gives me insight into how to consider these things when actually coding, rather than having to patch them later, is welcome.) I've been told by people that it might depend on many other things, like compiler or architecture, but I gathered from those responses that they were not completely sure of exactly how this was so. Is it really so dependent on a multitude of factors, or is there a default/intended behavior in the coding that may then be overridden by optional compiler keywords or system preferences?

(2) Would the subroutines have different interface needs? Again, I'm not an expert, but it has happened to me before that, because of the way I declared the variables of a subroutine, I ended up having to put the subroutines in a module. I've been given to understand this may vary depending on whether I use things that are specific to allocatable variables. I am thinking about the case in which everything I do with the variables can be done by both allocatables and automatics, not intentionally using anything specific to allocatables (other than allocation before usage, that is).

Finally, in case this is of use: the reason I am asking is that we are developing in a group and we have recently noticed that different people use those two declarations in different ways. We need to determine whether this is something that can be left to personal preference or whether there are reasons why it would be a good idea to set clear criteria (and how to set those criteria). I don't need extremely detailed answers; I am trying to determine whether this is something I should be researching so that we are careful about how we use it, and at which aspects of it the research should be directed.

Though I would be interested to know of "interesting tricks" that can be done with allocation but are not directly related to the need for size variability, I am leaving those for a possible future follow-up question and focusing here on the strictly functional differences (meaning: what I am explicitly telling compilers to do with my code). The two items I mentioned are the things I could come up with from previous experience, but if there is any other important one that I am missing and should consider, please do mention it.


Solution

  • For the sake of clarity, I'll briefly mention terminology. The two arrays are both local variables and arrays of rank 1.

    • alloc_array is an allocatable array;
    • automatic_array is an explicit-shape automatic object.

    Being local variables their scope is that of the procedure. Automatic arrays and unsaved allocatable arrays come to an end when execution of the procedure completes (with the allocatable array being deallocated); automatic objects cannot be saved and saved allocatable objects are not deallocated on completion of execution.
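
    As a rough illustration of that last distinction, here is a minimal sketch. The module and procedure names (lifetime_demo, unsaved_local, saved_local) are made up for this example and are not from the question. Each subroutine reports whether its local allocatable array was already allocated on entry.

```fortran
module lifetime_demo
  implicit none
contains
  subroutine unsaved_local(n, was_allocated)
    integer, intent(in)  :: n
    logical, intent(out) :: was_allocated
    integer, allocatable :: work(:)          ! unsaved: deallocated automatically on return

    was_allocated = allocated(work)          ! never allocated on entry
    allocate(work(n))
    work = 0
  end subroutine unsaved_local

  subroutine saved_local(n, was_allocated)
    integer, intent(in)  :: n
    logical, intent(out) :: was_allocated
    integer, allocatable, save :: cache(:)   ! saved: keeps its allocation between calls

    was_allocated = allocated(cache)
    if (.not. allocated(cache)) allocate(cache(n))
    cache = 0
  end subroutine saved_local
end module lifetime_demo

program show_lifetime
  use lifetime_demo
  implicit none
  logical :: first, second

  call unsaved_local(10, first)
  call unsaved_local(10, second)
  print *, 'unsaved, allocated on entry:', first, second   ! both false

  call saved_local(10, first)
  call saved_local(10, second)
  print *, 'saved,   allocated on entry:', first, second   ! false, then true
end program show_lifetime
```

    An automatic array has no counterpart to saved_local: adding save to its declaration is not allowed.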

    Again, as in the linked question, after the allocation statement both arrays are of size n. These are still two very different things. Of course, the allocatable array can have its allocation status changed and its allocation moved. I'll leave both of those (mostly) out of the scope of this answer. An allocatable array, of course, doesn't have to have these things changed once it's been allocated.

    Memory usage

    What was partly contentious about a previous revision of the question is how ill-defined the concept of memory usage is. Fortran, as a language definition, tells us that both arrays come to be the same size and they'll have the same storage layout, and are both contiguous. Beyond that, much follows terms you'll hear a lot: implementation specific and processor dependent.

    In a comment you expressed interest in ifort. So that I don't wander too far, I'll stick to that one compiler. Other compilers have similar concepts, albeit with different names and options.

    Often, ifort will place automatic objects and array temporaries on the stack. There is a (default) compiler option -no-heap-arrays, described as having the effect:

    The compiler puts automatic arrays and temporary arrays in the stack storage area.

    Using the alternative option -heap-arrays allows one to control that slightly:

    This option puts automatic arrays and arrays created for temporary computations on the heap instead of the stack.

    It is also possible to set a size threshold that decides whether the heap or the stack is chosen (when the size is known at compile time):

    If the compiler cannot determine the size at compile time, it always puts the automatic array on the heap.

    As n isn't a constant, one would expect automatic_array to be on the heap with this option, regardless of the size specified. To determine the size, n, of the array at compile time, the compiler would potentially need to do quite a bit of code analysis, even if it is possible.
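
    To make this concrete, here is a small sketch built around the subroutine from the question (the wrapping program placement_demo and the value 1000 are my own additions). It only checks what the language itself promises, namely equal size and contiguity. Where the storage lives is not visible from the code and depends on how it is compiled, for example with ifort's default -no-heap-arrays versus -heap-arrays as quoted above.

```fortran
program placement_demo
  implicit none
  call mysub(1000)
contains
  subroutine mysub(n)
    integer, intent(in)  :: n
    integer              :: automatic_array(n)   ! storage obtained on entry to mysub
    integer, allocatable :: alloc_array(:)       ! storage obtained by the allocate statement

    allocate(alloc_array(n))
    automatic_array = 1
    alloc_array     = 2

    ! The language guarantees the same size and a contiguous layout for both.
    print *, size(automatic_array), size(alloc_array)
    print *, is_contiguous(automatic_array), is_contiguous(alloc_array)

    ! Stack or heap is the compiler's choice; nothing here can observe it.
  end subroutine mysub
end program placement_demo
```

    The is_contiguous intrinsic is Fortran 2008, so this assumes a reasonably recent compiler.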

    There's probably more to be said, but this answer would be far too long if I tried. One thing to note, however, is that automatic local objects and (post-Fortran 90) allocatable local objects can be expected not to leak memory.

    Interface needs

    There is nothing special about the interface requirements of the subroutine mysub: local variables have no impact on that. Any program unit calling that would be happy with an implicit interface. What you are asking about is how the two local arrays can be used.

    This largely comes down to what uses the two arrays can be put to.

    If the dummy argument of a second procedure has the allocatable attribute then only the allocatable array here can be passed to that procedure. It will also need to have an explicit interface. This is true whether or not the procedure changes the allocation.

    Of course, both arrays could be passed as arguments to a dummy argument without the allocatable attribute and then we don't have different interface requirements.
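
    A minimal sketch of the two cases follows. The module consumers and its procedures needs_allocatable and takes_any are hypothetical names for illustration only.

```fortran
module consumers
  implicit none
contains
  ! Allocatable dummy: the actual argument must itself be allocatable, and
  ! callers need an explicit interface (provided here by the module).
  subroutine needs_allocatable(a)
    integer, allocatable, intent(inout) :: a(:)
    if (allocated(a)) a = a + 1
  end subroutine needs_allocatable

  ! Explicit-shape dummy: accepts either array. A procedure of this form
  ! could also be called through an implicit interface.
  subroutine takes_any(n, a)
    integer, intent(in)    :: n
    integer, intent(inout) :: a(n)
    a = a + 1
  end subroutine takes_any
end module consumers

program interface_demo
  use consumers
  implicit none
  call mysub(4)
contains
  subroutine mysub(n)
    integer, intent(in)  :: n
    integer              :: automatic_array(n)
    integer, allocatable :: alloc_array(:)

    allocate(alloc_array(n))
    automatic_array = 0
    alloc_array     = 0

    call takes_any(n, automatic_array)          ! fine
    call takes_any(n, alloc_array)              ! fine
    call needs_allocatable(alloc_array)         ! fine
    ! call needs_allocatable(automatic_array)   ! rejected: actual argument is not allocatable

    print *, automatic_array   ! every element is 1
    print *, alloc_array       ! every element is 2
  end subroutine mysub
end program interface_demo
```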

    Anyway, why would one want to pass an argument to an allocatable dummy when there will be no change in allocation status, etc.? There are good reasons:

    • there may be a code path in the procedure which does have an allocation change (controlled by a switch, say);
    • allocatable dummy arguments also pass bounds;
    • etc.

    This second one is more obvious if the subroutine had the specification

    subroutine mysub(n)
    integer, intent(in)  :: n
    integer              :: automatic_array(2:n+1)
    integer, allocatable :: alloc_array(:)
    
    allocate(alloc_array(2:n+1))
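
    With those bounds, the difference shows up as soon as the array is passed on. Here is a sketch, assuming two hypothetical receiving procedures (via_allocatable and via_assumed_shape) that just print the bounds they see:

```fortran
module bounds_demo
  implicit none
contains
  subroutine via_allocatable(a)
    integer, allocatable, intent(in) :: a(:)
    print *, 'allocatable dummy:   ', lbound(a, 1), ubound(a, 1)
  end subroutine via_allocatable

  subroutine via_assumed_shape(a)
    integer, intent(in) :: a(:)
    print *, 'assumed-shape dummy: ', lbound(a, 1), ubound(a, 1)
  end subroutine via_assumed_shape
end module bounds_demo

program show_bounds
  use bounds_demo
  implicit none
  integer, parameter   :: n = 3
  integer, allocatable :: alloc_array(:)

  allocate(alloc_array(2:n+1))
  alloc_array = 0

  call via_allocatable(alloc_array)     ! sees bounds 2 and 4
  call via_assumed_shape(alloc_array)   ! sees bounds 1 and 3
end program show_bounds
```

    The assumed-shape dummy receives only the shape, so its lower bound defaults to 1. The allocatable dummy receives the bounds as they were allocated.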
    

    Finally, an automatic object has quite strict conditions on its size. n here is clearly allowed, but things don't have to be much more complicated before allocation is the only plausible way, depending on how much one wants to play with block constructs.
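
    As a sketch of what I mean, suppose the size only becomes known after an executable statement has run. Reading it from a character string stands in for any run-time source, and the names size_after_work, make_buffers and text are invented for this example. An automatic array declared at the top of the subroutine can't use such a size, but an allocatable array can, as can an automatic array inside a block construct (Fortran 2008).

```fortran
program size_after_work
  implicit none
  call make_buffers('5')
contains
  subroutine make_buffers(text)
    character(len=*), intent(in) :: text
    integer, allocatable :: alloc_array(:)
    integer :: m

    ! m is only known after this statement has executed, so it cannot appear
    ! in the bounds of an automatic array declared at the top of make_buffers.
    read (text, *) m

    allocate(alloc_array(m))
    alloc_array = 0
    print *, size(alloc_array)

    block
      integer :: automatic_array(m)   ! automatic object scoped to the block
      automatic_array = 0
      print *, size(automatic_array)
    end block
  end subroutine make_buffers
end program size_after_work
```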

    Taking also a comment from IanH: if we have a very large n, the automatic object is likely to lead to crash-and-burn. With the allocatable, one could use the stat= specifier to come to some amicable agreement with the compiler run-time.
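
    A minimal sketch of that, using the subroutine from the question (the variables ierr and msg, and the wrapping program, are my additions):

```fortran
program allocation_status
  implicit none
  call mysub(1000)
contains
  subroutine mysub(n)
    integer, intent(in)  :: n
    integer, allocatable :: alloc_array(:)
    integer              :: ierr
    character(len=256)   :: msg

    ! With stat= present, a failed allocation sets ierr to a nonzero value
    ! instead of terminating the program.
    allocate(alloc_array(n), stat=ierr, errmsg=msg)
    if (ierr /= 0) then
      print *, 'allocation failed: ', trim(msg)
      return
    end if

    alloc_array = 0
    print *, 'allocated', size(alloc_array), 'elements'
  end subroutine mysub
end program allocation_status
```

    There is no equivalent hook for automatic_array: its storage is requested on entry to the procedure, before any of your statements get a chance to run.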