
Is Java Bytecode Sequentially executed by JVM?


I am very new to Java bytecode. From my understanding, disassembling a JAR file yields the bytecode that the JVM interprets directly (numbers). Each byte, or pair of bytes, is associated with a Java method in the actual Java source file. Where can I find a mapping of these?

Moreover, let's say I want to find out whether a variable was initialized in a class but never used again. Could I simply check where it was initialized, and then deem it unused if it never appears again in the bytecode after its initialization? For this logic to work, the JVM would have to execute bytecode sequentially, so that execution could not jump to another function after the variable is initialized, etc. Are function boundaries defined, unlike in general assembly code (Intel, MIPS)?

Thanks in advance.


Solution

  • It takes some time to understand JVM bytecode. To get you started, here are two things you need to know:

    • The JVM is a stack machine: to evaluate an expression, it first pushes the expression's inputs onto a stack. Evaluating the expression then amounts to popping all the inputs off the stack and pushing the result back onto the top of the stack. This result can, in turn, be used as an input to another expression's evaluation.

    • All parameters and local variables are stored in the local variable array.
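    To make the stack-machine idea concrete, here is a minimal sketch that evaluates (a + b) * c by pushing and popping values by hand. The class and method names are made up for illustration, and the comments only indicate which JVM instructions each step resembles; this is not how the JVM is implemented.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class OperandStackSketch {
    // Evaluates (a + b) * c the way a stack machine would:
    // push the inputs, pop them to compute, push the result back.
    static int evaluate(int a, int b, int c) {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(a);                          // resembles iload for a
        stack.push(b);                          // resembles iload for b
        stack.push(stack.pop() + stack.pop());  // resembles iadd: pop two, push the sum
        stack.push(c);                          // resembles iload for c
        stack.push(stack.pop() * stack.pop());  // resembles imul: pop two, push the product
        return stack.pop();                     // resembles ireturn: the result is on top
    }

    public static void main(String[] args) {
        System.out.println(evaluate(2, 3, 4));  // prints 20
    }
}
```

    Note how the sum left on the stack by the addition becomes an input of the multiplication without ever being stored in a variable.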

    Let's see that in practice. Here is some source code:

    package p1;
    
    public class Movie {
      // Field assigned by setPrice; javap omits private members unless run with -p.
      private int price;
    
      public void setPrice(int price) {
        this.price = price;
      }
    }
    

    As EJP said, you should run javap -c to see the bytecode: javap -c bin/p1/Movie.class. This is the output:

    public class p1.Movie {
      public p1.Movie();
        Code:
           0: aload_0       
           1: invokespecial #10   // Method java/lang/Object."<init>":()V
           4: return        
    
      public void setPrice(int);
        Code:
           0: aload_0       
           1: iload_1       
           2: putfield      #18    // Field price:I
           5: return        
    }
    

    The output shows the bytecode of two members: the default constructor and the setPrice method.

    The first instruction of setPrice, aload_0, takes the value of local variable 0 and pushes it onto the stack (complete list of instructions). In a non-static method, local variable 0 is always the this parameter, so after instruction 0 our stack is:

    | this |
    +------+
    

    The next instruction is iload_1, which takes the int value of local variable 1 and pushes it onto the stack. In our case, local variable 1 is the method's parameter (price). Our stack now looks as follows:

    | price |
    | this  |
    +-------+
    

    The next instruction, putfield #18, is the one doing the assignment this.price = price. It pops two values off the stack: the first popped value is the field's new value, and the second is the reference to the object holding the field. The field to be assigned is encoded in the instruction itself, which is why the instruction takes three bytes: it starts at position 2, but the next instruction starts at position 5. The extra value encoded in the instruction is "#18", an index into the constant pool. To see the constant pool, run javap -v bin/p1/Movie.class:

    Classfile /home/imaman/workspace/Movie-shop/bin/p1/Movie.class
    ...
    Constant pool:
       #1 = Class              #2             //  p1/Movie
       ...
       #5 = Utf8               price
       #6 = Utf8               I
       ...
       #18 = Fieldref          #1.#19          //  p1/Movie.price:I
       #19 = NameAndType       #5:#6           //  price:I
       ...
    

    So #18 specifies that the field to be assigned is the price field of the p1.Movie class. As you can see, #18 references #1 and #19, and #19 in turn references #5 and #6 (the field's name and its type descriptor). The actual name of the assigned field appears in the constant pool.
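    As for the "numbers" in the class file: each instruction is a one-byte opcode, optionally followed by operand bytes. The sketch below decodes the six bytes that the setPrice listing implies (offsets 0 to 5). The opcode values are taken from the JVM specification's instruction set; the byte array itself is inferred from the javap output rather than read from the class file, and the class name is made up. It handles only the four opcodes used here.

```java
import java.util.ArrayList;
import java.util.List;

public class TinyDisassembler {
    // Decodes only the four opcodes that appear in setPrice; anything else is rejected.
    static List<String> disassemble(byte[] code) {
        List<String> lines = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            int opcode = code[pc] & 0xFF;  // bytes are signed in Java, so mask to 0..255
            switch (opcode) {
                case 0x2A: lines.add(pc + ": aload_0"); pc += 1; break;
                case 0x1B: lines.add(pc + ": iload_1"); pc += 1; break;
                case 0xB1: lines.add(pc + ": return");  pc += 1; break;
                case 0xB5: {
                    // putfield is followed by a two-byte, big-endian constant pool index
                    int index = ((code[pc + 1] & 0xFF) << 8) | (code[pc + 2] & 0xFF);
                    lines.add(pc + ": putfield #" + index);
                    pc += 3;
                    break;
                }
                default:
                    throw new IllegalArgumentException("opcode not handled in this sketch: " + opcode);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Assumed code bytes of setPrice, reconstructed from the javap listing
        byte[] setPrice = {0x2A, 0x1B, (byte) 0xB5, 0x00, 0x12, (byte) 0xB1};
        disassemble(setPrice).forEach(System.out::println);
    }
}
```

    The offsets it prints (0, 1, 2, 5) line up with the javap listing, and the operand bytes 0x00 0x12 decode to 18, the constant pool index.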

    Back to our execution of the putfield instruction: having popped two values off the stack, the JVM now assigns the first popped value to the price field (indicated by #18) of the this object (the second popped value).

    The evaluation stack is now empty.

    The last instruction simply returns.
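
    The whole walk-through can be replayed by hand in plain Java. This is only a sketch: the nested Movie class is a stand-in for p1.Movie (with a non-private field so the demo can read it), and the locals array and stack are ordinary collections rather than anything the JVM exposes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class SetPriceTrace {
    // Stand-in for p1.Movie; the field is package-private so the sketch can inspect it.
    static class Movie {
        int price;
    }

    // Mimics the four instructions of setPrice one by one.
    static void setPrice(Movie self, int price) {
        Object[] locals = {self, price};       // local 0 = this, local 1 = price
        Deque<Object> stack = new ArrayDeque<>();

        stack.push(locals[0]);                 // 0: aload_0  -> stack: this
        stack.push(locals[1]);                 // 1: iload_1  -> stack: price, this

        int newValue = (Integer) stack.pop();  // 2: putfield: first pop is the new value
        Movie target = (Movie) stack.pop();    //    second pop is the object reference
        target.price = newValue;

        if (!stack.isEmpty()) {
            throw new IllegalStateException("operand stack should be empty here");
        }
        // 5: return
    }

    public static void main(String[] args) {
        Movie m = new Movie();
        setPrice(m, 42);
        System.out.println(m.price);           // prints 42
    }
}
```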