Understanding local var position in JVM bytecode on finally


I am having trouble understanding how local variable slots are assigned in ASMified Java bytecode. I have the following Java code:

public class TryCatch {
    public static void main(String[] args) {
        String test1 = null;
        try {
            String test2 = "try-inside-begin";
            System.out.println("try-outside-begin");
            try {
                System.out.println(test2);
                System.out.println(test1.length());
                System.out.println("try-inside-end");
            } catch (NullPointerException e) {
                test2 = "catch-inside: " + e.getMessage();
                throw new Exception(test2, e);
            }
            System.out.println("try-outside-end");
        } catch (Exception e) {
            System.out.println("catch-outside: " + e.getMessage());
        } finally {
            System.out.println("finally");
        }
    }
}

This compiles to the following bytecode for main:

  TRYCATCHBLOCK L0 L1 L2 java/lang/NullPointerException
  TRYCATCHBLOCK L3 L4 L5 java/lang/Exception
  TRYCATCHBLOCK L3 L4 L6 null
  TRYCATCHBLOCK L5 L7 L6 null
  TRYCATCHBLOCK L6 L8 L6 null
 L9
  LINENUMBER 5 L9
  ACONST_NULL
  ASTORE 1
 L3
  LINENUMBER 7 L3
  LDC "try-inside-begin"
  ASTORE 2
 L10
  LINENUMBER 8 L10
  GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
  LDC "try-outside-begin"
  INVOKEVIRTUAL java/io/PrintStream.println (Ljava/lang/String;)V
 L0
  LINENUMBER 10 L0
  GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
  ALOAD 2
  INVOKEVIRTUAL java/io/PrintStream.println (Ljava/lang/String;)V
 L11
  LINENUMBER 11 L11
  GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
  ALOAD 1
  INVOKEVIRTUAL java/lang/String.length ()I
  INVOKEVIRTUAL java/io/PrintStream.println (I)V
 L12
  LINENUMBER 12 L12
  GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
  LDC "try-inside-end"
  INVOKEVIRTUAL java/io/PrintStream.println (Ljava/lang/String;)V
 L1
  LINENUMBER 16 L1
  GOTO L13
 L2
  LINENUMBER 13 L2
 FRAME FULL [[Ljava/lang/String; java/lang/String java/lang/String] [java/lang/NullPointerException]
  ASTORE 3
 L14
  LINENUMBER 14 L14
  NEW java/lang/StringBuilder
  DUP
  INVOKESPECIAL java/lang/StringBuilder.<init> ()V
  LDC "catch-inside: "
  INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
  ALOAD 3
  INVOKEVIRTUAL java/lang/NullPointerException.getMessage ()Ljava/lang/String;
  INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
  INVOKEVIRTUAL java/lang/StringBuilder.toString ()Ljava/lang/String;
  ASTORE 2
 L15
  LINENUMBER 15 L15
  NEW java/lang/Exception
  DUP
  ALOAD 2
  ALOAD 3
  INVOKESPECIAL java/lang/Exception.<init> (Ljava/lang/String;Ljava/lang/Throwable;)V
  ATHROW
 L13
  LINENUMBER 17 L13
 FRAME SAME
  GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
  LDC "try-outside-end"
  INVOKEVIRTUAL java/io/PrintStream.println (Ljava/lang/String;)V
 L4
  LINENUMBER 21 L4
  GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
  LDC "finally"
  INVOKEVIRTUAL java/io/PrintStream.println (Ljava/lang/String;)V
 L16
  LINENUMBER 22 L16
  GOTO L17
 L5
  LINENUMBER 18 L5
 FRAME FULL [[Ljava/lang/String; java/lang/String] [java/lang/Exception]
  ASTORE 2
 L18
  LINENUMBER 19 L18
  GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
  NEW java/lang/StringBuilder
  DUP
  INVOKESPECIAL java/lang/StringBuilder.<init> ()V
  LDC "catch-outside: "
  INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
  ALOAD 2
  INVOKEVIRTUAL java/lang/Exception.getMessage ()Ljava/lang/String;
  INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
  INVOKEVIRTUAL java/lang/StringBuilder.toString ()Ljava/lang/String;
  INVOKEVIRTUAL java/io/PrintStream.println (Ljava/lang/String;)V
 L7
  LINENUMBER 21 L7
  GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
  LDC "finally"
  INVOKEVIRTUAL java/io/PrintStream.println (Ljava/lang/String;)V
 L19
  LINENUMBER 22 L19
  GOTO L17
 L6
  LINENUMBER 21 L6
 FRAME SAME1 java/lang/Throwable
  ASTORE 4
 L8
  GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
  LDC "finally"
  INVOKEVIRTUAL java/io/PrintStream.println (Ljava/lang/String;)V
  ALOAD 4
  ATHROW
 L17
  LINENUMBER 23 L17
 FRAME SAME
  RETURN
  MAXSTACK = 4
  MAXLOCALS = 5

Notice the ASTORE 4/ALOAD 4 pair near the bottom. Why is it slot 4 instead of 3? A SAME1 frame means "same locals as the previous frame and with a single value on the stack", and the previous frame has only two locals (ref: FRAME FULL [[Ljava/lang/String; java/lang/String] [java/lang/Exception]).

I have read the spec, but it is not clear to me from there either why it is not 3.


Solution

  • A stack frame describes the state of the local variables and the operand stack at the point where it appears, and nothing more. Instructions after that point can modify the locals as usual. As you correctly identified, the stack frame at L6 says that there are two local variables when control flow reaches L6. The following instruction stores to slot 4, which is perfectly legal: a store may target any slot below MAXLOCALS (5 here), regardless of how many locals the preceding frame lists.

    It may help to understand the purpose of the stack map. Originally, there was no stack map at all, and the verifier used inference to calculate the types of the local variables and the operand stack at every point in the method. Wherever control flow joined, it merged the incoming states at that point and iterated until the result converged.

    Unfortunately, this was slow, so in an attempt to speed things up, Sun added stack maps (the StackMapTable attribute, introduced with Java 6 class files). A stack map essentially precomputes the verification results at every point where control flow joins, that is, at branch targets and exception handlers. That way, the verifier can do a single linear pass through the code, because it never has to revisit earlier instructions to merge states. When the verifier encounters a branch, it checks whether the current state is compatible with the stack frame declared at the jump target, and if not, it rejects the class with a VerifyError. In sections of linear code there is no need to include stack frames, since the verifier can compute the state instruction by instruction, just as it did before.

    Stack frames are not meant for debugging. They are meant to speed up verification, so they include only the minimum information the verifier needs. If the compiler were to hypothetically insert a stack frame after every instruction, then the stack frame after the astore 4 would of course show a Throwable in slot 4, with slots 2 and 3 marked as holding no usable value.

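    As for why it used slot 4 when it could have used slot 3, that is a choice made by the compiler, not something the JVM requires. Slots 2 and 3 had already been used earlier in the method (for test2 and for the inner catch variable e), so javac appears to have picked the first slot that had never been used rather than reusing one. Perhaps that simplified the implementation of javac, but that's just speculation.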
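
To make the two points above concrete, here is a minimal sketch of how the tracked state changes after ASTORE 4 and how a frame check at a jump target might look. It is neither ASM nor the real verifier: the class FrameSketch and its methods astore and matches are hypothetical names, types are modelled as plain strings, and the check compares type names for equality where the actual verifier tests assignability.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FrameSketch {
    // Marker for a slot that holds no usable value (hypothetical string encoding).
    static final String TOP = "TOP";

    // Effect of ASTORE n on the tracked state: pop the top of the stack
    // and record its type in slot n, padding any skipped slots with TOP.
    static List<String> astore(List<String> locals, List<String> stack, int slot) {
        List<String> result = new ArrayList<>(locals);
        while (result.size() <= slot) {
            result.add(TOP);
        }
        result.set(slot, stack.remove(stack.size() - 1));
        return result;
    }

    // Simplified frame check at a jump target: every slot the frame declares
    // must hold exactly that type. The real verifier tests assignability
    // (subtypes are accepted); undeclared slots count as TOP and accept anything.
    static boolean matches(List<String> current, List<String> declared) {
        for (int i = 0; i < declared.size(); i++) {
            String wanted = declared.get(i);
            if (wanted.equals(TOP)) {
                continue;
            }
            if (i >= current.size() || !current.get(i).equals(wanted)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String array = "[Ljava/lang/String;";
        String string = "java/lang/String";

        // State declared by FRAME SAME1 java/lang/Throwable at L6.
        List<String> locals = Arrays.asList(array, string);
        List<String> stack = new ArrayList<>(Arrays.asList("java/lang/Throwable"));

        // What a frame placed right after ASTORE 4 would have to list.
        System.out.println(astore(locals, stack, 4));

        // Both GOTO L17 paths satisfy the two-local frame declared at L17.
        List<String> atL17 = Arrays.asList(array, string);
        System.out.println(matches(Arrays.asList(array, string, string), atL17));
        System.out.println(matches(Arrays.asList(array, string, "java/lang/Exception"), atL17));
    }
}
```

The first line printed lists slots 2 and 3 as TOP and slot 4 as java/lang/Throwable. The two checks print true because the frame at L17 only constrains slots 0 and 1, so it does not matter that slot 2 holds a String on one path and an Exception on the other.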