Tags: java, optimization, foreach, final, variable-declaration

foreach: why can't the element variable be declared outside?


A "foreach" loop in Java looks like this, for example:

for (Mouse mouse: mouses) {
    [...]
}

We can't do:

Mouse mouse;
for (mouse: mouses) {
    [...]
}

I quote geeksforgeeks: Since the i variable goes out of scope with each iteration of the loop, it is actually re-declaration each iteration

With the second form the variable would be declared only once. I don't know whether this would yield even a small optimization, but it is what I do in "normal" loops, in every language.

Also, in this way the last element would be available also outside the cycle. This is for example the default in Python.


As a related question, is there any advantage to writing

for (final Mouse mouse: mouses) {
    [...]
}

in terms of speed, or does it only mean that mouse can't be reassigned inside the loop?


Solution

  • You wrote:

    I quote geeksforgeeks: “Since the i variable goes out of scope with each iteration of the loop, it is actually re-declaration each iteration”

    This is formally correct, but its impact is also purely formal. As the linked article already describes, the consequence of this definition is that you are allowed to put the final modifier on the variable.

    This in turn implies that you are allowed to capture the variable’s value in an inner class or lambda expression when it is declared final or when no other assignment happens in the loop body (which makes it effectively final).
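    The capture rule can be illustrated with a minimal sketch. The class and method names (CaptureDemo, captureEach, captureIndexed) are hypothetical and not part of the question; the point is only that the for-each variable can be captured as long as the loop body never assigns to it, whereas the counter of a classic indexed loop cannot, because i++ modifies it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class CaptureDemo {
    // The for-each variable is never reassigned in the body, so it is
    // effectively final and a lambda may capture it.
    static List<Supplier<String>> captureEach(List<String> names) {
        List<Supplier<String>> suppliers = new ArrayList<>();
        for (String name : names) {
            // Adding "name = name.trim();" here would make the capture below a compile error.
            suppliers.add(() -> name);
        }
        return suppliers;
    }

    // The counter of an indexed loop is modified by i++, so it is not
    // effectively final and has to be copied before it can be captured.
    static List<Supplier<String>> captureIndexed(List<String> names) {
        List<Supplier<String>> suppliers = new ArrayList<>();
        for (int i = 0; i < names.size(); i++) {
            // suppliers.add(() -> names.get(i));  // does not compile
            final int index = i;
            suppliers.add(() -> names.get(index));
        }
        return suppliers;
    }

    public static void main(String[] args) {
        // Prints a, b and c on separate lines.
        for (Supplier<String> s : captureEach(List.of("a", "b", "c"))) {
            System.out.println(s.get());
        }
    }
}
```

    Writing "final String name" in the first loop would compile to the same behavior; it only turns an accidental reassignment into a compile error.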

    Besides that, it produces the same bytecode for the method. The declarations of local variables, including their name and scope and whether they are final, are merely a compile-time artifact. They may get stored in debug information, but that information is optional and must not influence the operations of the virtual machine.

    The stack is organized in frames, memory blocks large enough to provide space for all local variables of a method that may exist at the same time (having overlapping scopes). A frame is already reserved when the method is entered.

    See The Java® Language Specification, § 15.12.4.5:

    A method m in some class S has been identified as the one to be invoked.

    Now a new activation frame is created, containing the target reference (if any) and the argument values (if any), as well as enough space for the local variables and stack for the method to be invoked and any other bookkeeping information that may be required by the implementation…

    and The Java® Virtual Machine Specification, § 2.6

    A frame is used to store data and partial results, as well as to perform dynamic linking, return values for methods, and dispatch exceptions.

    A new frame is created each time a method is invoked. A frame is destroyed when its method invocation completes, […]

    The sizes of the local variable array and the operand stack are determined at compile-time and are supplied along with the code for the method associated with the frame (§4.7.3). Thus the size of the frame data structure depends only on the implementation of the Java Virtual Machine, and the memory for these structures can be allocated simultaneously on method invocation.

    You might say that, theoretically, a JVM could implement it differently, as long as the observable behavior stays compatible. But as said at the beginning, the bytecode of these constructs does not differ. There are no actions associated with the declaration of a variable nor with it going out of scope, so an implementation can’t perform allocations or deallocations at these points, as it doesn’t even know whether and where these points exist.

    When you use the following program:

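    import java.io.IOException;
    import java.lang.invoke.MethodHandles;
    import java.nio.file.Paths;
    import java.util.Collection;
    import java.util.Iterator;

    public class VarScopes {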
        public static void forEachLoop(Collection<?> c) {
            for(Object o: c) {
                System.out.println(o);
            }
        }
        public static void iteratorLoop(Collection<?> c) {
            for(Iterator<?> it = c.iterator(); it.hasNext();) {
                Object o = it.next();
                System.out.println(o);
            }
        }
        public static void iteratorLoopExtendedScope(Collection<?> c) {
            Iterator<?> it;
            Object o;
            for(it = c.iterator(); it.hasNext();) {
                o = it.next();
                System.out.println(o);
            }
        }
        public static void main(String[] args) throws IOException, InterruptedException {
            decompile();
        }
        private static void decompile() throws InterruptedException, IOException {
            new ProcessBuilder(
                    Paths.get(System.getProperty("java.home"), "bin", "javap").toString(),
                    "-cp", System.getProperty("java.class.path"),
                    "-c", MethodHandles.lookup().lookupClass().getName())
                    .inheritIO()
                    .start()
                    .waitFor();
        }
        private VarScopes() {}
    }
    

    You will get the following output or something similar, e.g. on repl.it:

    Compiled from "VarScopes.java"
    public class VarScopes {
      public static void forEachLoop(java.util.Collection<?>);
        Code:
           0: aload_0
           1: invokeinterface #1,  1            // InterfaceMethod java/util/Collection.iterator:()Ljava/util/Iterator;
           6: astore_1
           7: aload_1
           8: invokeinterface #2,  1            // InterfaceMethod java/util/Iterator.hasNext:()Z
          13: ifeq          33
          16: aload_1
          17: invokeinterface #3,  1            // InterfaceMethod java/util/Iterator.next:()Ljava/lang/Object;
          22: astore_2
          23: getstatic     #4                  // Field java/lang/System.out:Ljava/io/PrintStream;
          26: aload_2
          27: invokevirtual #5                  // Method java/io/PrintStream.println:(Ljava/lang/Object;)V
          30: goto          7
          33: return
    
      public static void iteratorLoop(java.util.Collection<?>);
        Code:
           0: aload_0
           1: invokeinterface #1,  1            // InterfaceMethod java/util/Collection.iterator:()Ljava/util/Iterator;
           6: astore_1
           7: aload_1
           8: invokeinterface #2,  1            // InterfaceMethod java/util/Iterator.hasNext:()Z
          13: ifeq          33
          16: aload_1
          17: invokeinterface #3,  1            // InterfaceMethod java/util/Iterator.next:()Ljava/lang/Object;
          22: astore_2
          23: getstatic     #4                  // Field java/lang/System.out:Ljava/io/PrintStream;
          26: aload_2
          27: invokevirtual #5                  // Method java/io/PrintStream.println:(Ljava/lang/Object;)V
          30: goto          7
          33: return
    
      public static void iteratorLoopExtendedScope(java.util.Collection<?>);
        Code:
           0: aload_0
           1: invokeinterface #1,  1            // InterfaceMethod java/util/Collection.iterator:()Ljava/util/Iterator;
           6: astore_1
           7: aload_1
           8: invokeinterface #2,  1            // InterfaceMethod java/util/Iterator.hasNext:()Z
          13: ifeq          33
          16: aload_1
          17: invokeinterface #3,  1            // InterfaceMethod java/util/Iterator.next:()Ljava/lang/Object;
          22: astore_2
          23: getstatic     #4                  // Field java/lang/System.out:Ljava/io/PrintStream;
          26: aload_2
          27: invokevirtual #5                  // Method java/io/PrintStream.println:(Ljava/lang/Object;)V
          30: goto          7
          33: return
    …
    

    In other words, identical bytecode for all variants.

    “Also, in this way the last element would be available also outside the cycle. This is for example the default in Python.”

    That would be an actual difference. Whether this would be an improvement is debatable. Since the variable would not be initialized when the collection is empty, we can’t use the variable o after the loop in the iteratorLoopExtendedScope example above. We would need an initialization before the loop to guarantee that the variable is definitely assigned in every case, but then, the code would do more than the standard for-each loop, not less…
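    As a rough sketch of that last point, the following hypothetical helper (LastElementDemo and lastElement are illustrative names, not from the question) keeps the last element available after the loop. It assumes that null is an acceptable result for an empty collection; without the initialization, the return statement would be rejected because the variable is not definitely assigned.

```java
import java.util.Collection;
import java.util.List;

public class LastElementDemo {
    // Returns the last element visited, or null when the collection is empty.
    static <T> T lastElement(Collection<T> c) {
        // The initialization is required: if the loop body never runs,
        // 'last' would otherwise be unassigned at the return statement.
        T last = null;
        for (T element : c) {
            last = element; // extra assignment on top of the plain for-each loop
        }
        return last;
    }

    public static void main(String[] args) {
        String last = lastElement(List.of("x", "y", "z"));
        String none = lastElement(List.<String>of());
        System.out.println(last); // prints z
        System.out.println(none); // prints null
    }
}
```

    The loop variable itself stays scoped to the loop; the value that survives is a separate variable that is explicitly written on every iteration.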