Found an unspecified JVM Bytecode (0xe2) in java class file


I've recently been developing a program that analyzes Java class files. I ran it on the class file compiled from the following source:

class test_1 {

    public static String a = "Hello World";

    public static void main(String[] args) {
        int j = 0;
        for(int i = 0;i<10;i++) {
            System.out.println(a);
            j = j + j*j +j/(j+1);
        }
    }
}

The output contains a bytecode 0xe2, which is not specified in the JVM Specification (Java SE 14). What does 0xe2 do?


Solution

  • Your program is outputting every byte as if it were a bytecode instruction, ignoring the fact that many instructions have parameters and are therefore multi-byte instructions.

    E.g. your program is incorrectly outputting the constructor as follows:

    2a: aload_0
    b7: invokespecial
    00: nop
    01: aconst_null
    b1: return
    

    If you run javap -c test_1.class, however, you will see:

    0: aload_0
    1: invokespecial #1   // Method java/lang/Object."<init>":()V
    4: return
    

    The number before the colon is the offset, not the bytecode. As you can see, offsets 2 and 3 are missing, because the invokespecial instruction uses 2 bytes for parameters, which is documented:

    Format

    invokespecial
    indexbyte1
    indexbyte2
    

    Description

    The unsigned indexbyte1 and indexbyte2 are used to construct an index into the run-time constant pool of the current class (§2.6), where the value of the index is (indexbyte1 << 8) | indexbyte2.

    With the 2 bytes being 00 and 01, the index is (0x00 << 8) | 0x01 = 1, so the bytecode instruction is as javap showed: invokespecial #1

    If you then look at the constant pool output, you'll see that constant #1 is a methodref to the Object no-arg constructor.

    Your specific question is about the bytes a7 ff e2, which are not 3 separate instructions but a single 3-byte goto instruction:

    Format

    goto
    branchbyte1
    branchbyte2
    

    Description

    The unsigned bytes branchbyte1 and branchbyte2 are used to construct a signed 16-bit branchoffset, where branchoffset is (branchbyte1 << 8) | branchbyte2.

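    This means that ff e2 gives branchoffset = 0xffe2, which is -30 when read as a signed 16-bit value (0xffe2 - 0x10000 = -30), i.e. a backward jump of 30 bytes relative to the goto itself. So instead of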

    a7: goto
    ff: impdep2
    e2: (null)
    

    Your program should have printed something like:

    a7 ff e2: goto -30
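
The decoding described above can be sketched in a few lines of Java. This is only an illustration, not a complete disassembler: the class name MiniDisassembler and its methods are made up for this example, and the opcode table covers just the four opcodes discussed here (aload_0, invokespecial, return, goto). A real tool would need the operand length of every opcode from the JVM specification, including variable-length ones such as tableswitch, lookupswitch and wide.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: knows just the four opcodes discussed above.
class MiniDisassembler {

    /** Number of operand bytes that follow the given opcode. */
    static int operandBytes(int opcode) {
        switch (opcode) {
            case 0x2a: return 0; // aload_0
            case 0xb1: return 0; // return
            case 0xb7: return 2; // invokespecial indexbyte1 indexbyte2
            case 0xa7: return 2; // goto branchbyte1 branchbyte2
            default:
                throw new IllegalArgumentException(
                        String.format("opcode 0x%02x is not in this sketch's table", opcode));
        }
    }

    /** Decodes the code array one instruction at a time, skipping over operand bytes. */
    static List<String> disassemble(byte[] code) {
        List<String> lines = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            int opcode = code[pc] & 0xff; // bytes are signed in Java, so mask to 0..255
            int n = operandBytes(opcode);
            if (pc + n >= code.length) {
                throw new IllegalArgumentException("truncated instruction at offset " + pc);
            }
            // Both 2-byte operands used here are big-endian: (byte1 << 8) | byte2.
            int raw = n == 2 ? ((code[pc + 1] & 0xff) << 8) | (code[pc + 2] & 0xff) : 0;
            switch (opcode) {
                case 0x2a: lines.add(pc + ": aload_0"); break;
                case 0xb1: lines.add(pc + ": return"); break;
                // Constant pool index: unsigned 16-bit value.
                case 0xb7: lines.add(pc + ": invokespecial #" + raw); break;
                // Branch offset: the same 16 bits reinterpreted as signed.
                case 0xa7: lines.add(pc + ": goto " + (short) raw); break;
            }
            pc += 1 + n; // advance past the opcode and its operands
        }
        return lines;
    }

    public static void main(String[] args) {
        byte[] constructor = {0x2a, (byte) 0xb7, 0x00, 0x01, (byte) 0xb1};
        byte[] loopBranch = {(byte) 0xa7, (byte) 0xff, (byte) 0xe2};
        disassemble(constructor).forEach(System.out::println);
        disassemble(loopBranch).forEach(System.out::println);
    }
}
```

For the constructor bytes 2a b7 00 01 b1 this prints the same offsets as javap (0, 1 and 4), and for a7 ff e2 it prints "0: goto -30". The offset there is 0 only because the three bytes are decoded in isolation. The printed value is the relative branch offset, whereas javap prints the absolute target, which would be the instruction's own offset plus the branch offset.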