I am having trouble understanding variable positioning in ASMified Java bytecode. I have the following Java code:
public class TryCatch {
    public static void main(String[] args) {
        String test1 = null;
        try {
            String test2 = "try-inside-begin";
            System.out.println("try-outside-begin");
            try {
                System.out.println(test2);
                System.out.println(test1.length());
                System.out.println("try-inside-end");
            } catch (NullPointerException e) {
                test2 = "catch-inside: " + e.getMessage();
                throw new Exception(test2, e);
            }
            System.out.println("try-outside-end");
        } catch (Exception e) {
            System.out.println("catch-outside: " + e.getMessage());
        } finally {
            System.out.println("finally");
        }
    }
}
Which becomes the following bytecode for main:
TRYCATCHBLOCK L0 L1 L2 java/lang/NullPointerException
TRYCATCHBLOCK L3 L4 L5 java/lang/Exception
TRYCATCHBLOCK L3 L4 L6 null
TRYCATCHBLOCK L5 L7 L6 null
TRYCATCHBLOCK L6 L8 L6 null
L9
LINENUMBER 5 L9
ACONST_NULL
ASTORE 1
L3
LINENUMBER 7 L3
LDC "try-inside-begin"
ASTORE 2
L10
LINENUMBER 8 L10
GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
LDC "try-outside-begin"
INVOKEVIRTUAL java/io/PrintStream.println (Ljava/lang/String;)V
L0
LINENUMBER 10 L0
GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
ALOAD 2
INVOKEVIRTUAL java/io/PrintStream.println (Ljava/lang/String;)V
L11
LINENUMBER 11 L11
GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
ALOAD 1
INVOKEVIRTUAL java/lang/String.length ()I
INVOKEVIRTUAL java/io/PrintStream.println (I)V
L12
LINENUMBER 12 L12
GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
LDC "try-inside-end"
INVOKEVIRTUAL java/io/PrintStream.println (Ljava/lang/String;)V
L1
LINENUMBER 16 L1
GOTO L13
L2
LINENUMBER 13 L2
FRAME FULL [[Ljava/lang/String; java/lang/String java/lang/String] [java/lang/NullPointerException]
ASTORE 3
L14
LINENUMBER 14 L14
NEW java/lang/StringBuilder
DUP
INVOKESPECIAL java/lang/StringBuilder.<init> ()V
LDC "catch-inside: "
INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
ALOAD 3
INVOKEVIRTUAL java/lang/NullPointerException.getMessage ()Ljava/lang/String;
INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
INVOKEVIRTUAL java/lang/StringBuilder.toString ()Ljava/lang/String;
ASTORE 2
L15
LINENUMBER 15 L15
NEW java/lang/Exception
DUP
ALOAD 2
ALOAD 3
INVOKESPECIAL java/lang/Exception.<init> (Ljava/lang/String;Ljava/lang/Throwable;)V
ATHROW
L13
LINENUMBER 17 L13
FRAME SAME
GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
LDC "try-outside-end"
INVOKEVIRTUAL java/io/PrintStream.println (Ljava/lang/String;)V
L4
LINENUMBER 21 L4
GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
LDC "finally"
INVOKEVIRTUAL java/io/PrintStream.println (Ljava/lang/String;)V
L16
LINENUMBER 22 L16
GOTO L17
L5
LINENUMBER 18 L5
FRAME FULL [[Ljava/lang/String; java/lang/String] [java/lang/Exception]
ASTORE 2
L18
LINENUMBER 19 L18
GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
NEW java/lang/StringBuilder
DUP
INVOKESPECIAL java/lang/StringBuilder.<init> ()V
LDC "catch-outside: "
INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
ALOAD 2
INVOKEVIRTUAL java/lang/Exception.getMessage ()Ljava/lang/String;
INVOKEVIRTUAL java/lang/StringBuilder.append (Ljava/lang/String;)Ljava/lang/StringBuilder;
INVOKEVIRTUAL java/lang/StringBuilder.toString ()Ljava/lang/String;
INVOKEVIRTUAL java/io/PrintStream.println (Ljava/lang/String;)V
L7
LINENUMBER 21 L7
GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
LDC "finally"
INVOKEVIRTUAL java/io/PrintStream.println (Ljava/lang/String;)V
L19
LINENUMBER 22 L19
GOTO L17
L6
LINENUMBER 21 L6
FRAME SAME1 java/lang/Throwable
ASTORE 4
L8
GETSTATIC java/lang/System.out : Ljava/io/PrintStream;
LDC "finally"
INVOKEVIRTUAL java/io/PrintStream.println (Ljava/lang/String;)V
ALOAD 4
ATHROW
L17
LINENUMBER 23 L17
FRAME SAME
RETURN
MAXSTACK = 4
MAXLOCALS = 5
Notice how near the bottom there is ASTORE 4/ALOAD 4. Why is that 4 instead of 3? A SAME1 frame means "same locals as the previous frame and with a single value on the stack", and the previous frame only has two locals (ref: FRAME FULL [[Ljava/lang/String; java/lang/String] [java/lang/Exception]).
I have read the spec, but it is still not clear to me why it is not 3.
The stack frame describes the state of the local variables and the operand stack at the point where it appears. Later instructions can of course modify that state as usual. As you correctly identified, the stack frame at L6 says that there are two local variables when control flow reaches L6. The following instruction stores to slot 4, which is perfectly legal: a store only requires the slot index to be below MAXLOCALS (5 here), not that the slot is already listed in the preceding frame.
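To make that concrete, here is a minimal sketch of how the locals evolve after the frame at L6. It is a toy model, not the verifier's actual data structures: the class FrameSketch and the helper astore are hypothetical names, and "TOP" is used as a marker for a slot that holds no usable value.

```java
import java.util.ArrayList;
import java.util.List;

class FrameSketch {
    // Hypothetical helper: models the effect of ASTORE <slot> on the list of local types.
    // Slots between the old end of the list and <slot> are padded with "TOP" (no usable value).
    static List<String> astore(List<String> locals, int slot, String type) {
        List<String> result = new ArrayList<>(locals);
        while (result.size() <= slot) {
            result.add("TOP");
        }
        result.set(slot, type);
        return result;
    }

    public static void main(String[] args) {
        // Locals declared by the frame at L6: slot 0 = args, slot 1 = test1.
        List<String> atL6 = List.of("[Ljava/lang/String;", "java/lang/String");
        // ASTORE 4 pops the Throwable off the stack and writes it to slot 4.
        List<String> afterStore = astore(atL6, 4, "java/lang/Throwable");
        System.out.println(afterStore);
    }
}
```

The resulting list has five entries, which is consistent with MAXLOCALS = 5 in the listing; slots 2 and 3 simply hold nothing usable at that point.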
It may help to understand the purpose of the stack map. Originally, there was no stack map at all and the verifier used inference to calculate the local variables at every point in the method. When encountering control flow, it would merge in the values at that point and iterate until convergence.
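The merge-and-iterate idea can be sketched roughly as follows. This is a heavily simplified toy, not the real inference verifier: the class InferenceSketch, its method names, and the block encoding (each block performs at most one store) are assumptions made for illustration, and disagreeing slots are simply marked "TOP" whereas the real verifier computes a common supertype for references.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

class InferenceSketch {
    // Simplified merge of two local-variable states of equal length:
    // slots that disagree become "TOP" (unusable).
    static String[] merge(String[] a, String[] b) {
        String[] out = new String[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = a[i].equals(b[i]) ? a[i] : "TOP";
        }
        return out;
    }

    // succ[i] lists the successors of block i. storeSlot[i]/storeType[i] describe
    // the single store that block i performs (slot -1 means the block stores nothing).
    static String[][] infer(int[][] succ, int[] storeSlot, String[] storeType, String[] entry) {
        String[][] in = new String[succ.length][];
        in[0] = entry;
        Deque<Integer> work = new ArrayDeque<>();
        work.add(0);
        while (!work.isEmpty()) {
            int b = work.remove();
            String[] out = in[b].clone();
            if (storeSlot[b] >= 0) {
                out[storeSlot[b]] = storeType[b];
            }
            for (int s : succ[b]) {
                String[] merged = (in[s] == null) ? out : merge(in[s], out);
                if (!Arrays.equals(merged, in[s])) {
                    in[s] = merged;
                    work.add(s); // state changed, so this block must be revisited
                }
            }
        }
        return in;
    }

    public static void main(String[] args) {
        // Block 0 branches to blocks 1 and 2, which both continue to block 3.
        int[][] succ = {{1, 2}, {3}, {3}, {}};
        int[] storeSlot = {-1, 2, 2, -1};
        String[] storeType = {null, "String", "Exception", null};
        String[] entry = {"String[]", "String", "TOP"};
        String[][] in = infer(succ, storeSlot, storeType, entry);
        System.out.println(Arrays.toString(in[3]));
    }
}
```

Block 3 is reached with a String in slot 2 on one path and an Exception on the other, so in this toy model slot 2 ends up unusable there. The repeated revisiting of blocks whenever their incoming state changes is the part that made this approach slow.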
Unfortunately, this was slow, so in an attempt to speed things up, Sun (now Oracle) added stack maps in Java 6. A stack map essentially precomputes the verification results at every point where control flow is joined. That way, the verifier can do a single linear pass through the code, because the state at each jump target is already declared and never needs to be recomputed. When the verifier encounters control flow, it checks whether the current state is compatible with the stack frame declared at the jump target, and if not, rejects the class with a VerifyError. In sections of linear code, there is obviously no need to include stack frames, since the verifier can just do the same thing it did before.
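The check at a jump can be sketched like this. It is a simplification under stated assumptions: the class StackMapCheckSketch and the method matches are hypothetical, only locals are compared (not the operand stack), and types are compared by name, whereas the real verifier tests assignability including subtyping.

```java
class StackMapCheckSketch {
    // Simplified compatibility test: every slot the declared frame mentions must
    // hold the same type in the current state. Extra locals in the current state
    // are allowed; they are simply forgotten at the jump target.
    static boolean matches(String[] current, String[] declared) {
        if (current.length < declared.length) {
            return false;
        }
        for (int i = 0; i < declared.length; i++) {
            if (!declared[i].equals("TOP") && !declared[i].equals(current[i])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // State at the GOTO L17 after the first "finally" copy: args, test1, test2.
        String[] current = {"[Ljava/lang/String;", "java/lang/String", "java/lang/String"};
        // Frame declared at L17: only args and test1.
        String[] atL17 = {"[Ljava/lang/String;", "java/lang/String"};
        System.out.println(matches(current, atL17));

        // A frame that wrongly claimed slot 2 holds an Exception would be rejected.
        String[] bogus = {"[Ljava/lang/String;", "java/lang/String", "java/lang/Exception"};
        System.out.println(matches(current, bogus));
    }
}
```

The first call shows why the jump to L17 is fine even though test2 is still sitting in slot 2: the declared frame just does not mention that slot.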
Stack frames are not meant for debugging; they're meant to speed up verification, so they include only the minimum information necessary for verification. If the compiler were to hypothetically insert a stack frame at every instruction, then the stack frame after the astore 4 would of course show a new variable in slot 4.
As for why it used slot 4 when it could have used slot 3, that's just a whim of the compiler. Perhaps it simplified the implementation of javac, but that's just speculation.