
Java bytecode: local variables table vs on-stack calculation


Assume that we have the following class:

final class Impl implements Gateway3 {
    private final Sensor sensor1;
    private final Sensor sensor2;
    private final Sensor sensor3;

    private final Alarm alarm;

    public Impl(Sensor sensor1, Sensor sensor2, Sensor sensor3, Alarm alarm) {
        this.sensor1 = sensor1;
        this.sensor2 = sensor2;
        this.sensor3 = sensor3;
        this.alarm = alarm;
    }

    @Override
    public Temperature averageTemp() {
        final Temperature temp1 = sensor1.temperature();
        final Temperature temp2 = sensor2.temperature();
        final Temperature temp3 = sensor3.temperature();

        final Average tempAvg = new Average.Impl(temp1, temp2, temp3);
        final Temperature result = tempAvg.result();
        return result;
    }

    @Override
    public void poll() {
        final Temperature avgTemp = this.averageTemp();
        this.alarm.trigger(avgTemp);
    }
}

This class makes heavy use of local variables, and all of them are final.

If we look at the bytecode generated for, let's say, the averageTemp method, we'll see the following:

   0: aload_0
   1: getfield      #2                  // Field sensor1:Lru/mera/avral/script/bytecode/demo/Sensor;
   4: invokeinterface #6,  1            // InterfaceMethod ru/mera/avral/script/bytecode/demo/Sensor.temperature:()Lru/mera/avral/script/bytecode/demo/Temperature;
   9: astore_1
  10: aload_0
  11: getfield      #3                  // Field sensor2:Lru/mera/avral/script/bytecode/demo/Sensor;
  14: invokeinterface #6,  1            // InterfaceMethod ru/mera/avral/script/bytecode/demo/Sensor.temperature:()Lru/mera/avral/script/bytecode/demo/Temperature;
  19: astore_2
  20: aload_0
  21: getfield      #4                  // Field sensor3:Lru/mera/avral/script/bytecode/demo/Sensor;
  24: invokeinterface #6,  1            // InterfaceMethod ru/mera/avral/script/bytecode/demo/Sensor.temperature:()Lru/mera/avral/script/bytecode/demo/Temperature;
  29: astore_3
  30: new           #7                  // class ru/mera/avral/script/bytecode/demo/Average$Impl
  33: dup
  34: aload_1
  35: aload_2
  36: aload_3
  37: invokespecial #8                  // Method ru/mera/avral/script/bytecode/demo/Average$Impl."<init>":(Lru/mera/avral/script/bytecode/demo/Temperature;Lru/mera/avral/script/bytecode/demo/Temperature;Lru/mera/avral/script/bytecode/demo/Temperature;)V
  40: astore        4
  42: aload         4
  44: invokeinterface #9,  1            // InterfaceMethod ru/mera/avral/script/bytecode/demo/Average.result:()Lru/mera/avral/script/bytecode/demo/Temperature;
  49: astore        5
  51: aload         5
  53: areturn

There are plenty of astore opcodes.

Now, assume that I used a bytecode generation library to generate the following bytecode for the same method:

   0: new           #18                 // class ru/mera/avral/script/bytecode/demo/Average$Impl
   3: dup
   4: aload_0
   5: getfield      #20                 // Field sensor1:Lru/mera/avral/script/bytecode/demo/Sensor;
   8: invokeinterface #25,  1           // InterfaceMethod ru/mera/avral/script/bytecode/demo/Sensor.temperature:()Lru/mera/avral/script/bytecode/demo/Temperature;
  13: aload_0
  14: getfield      #27                 // Field sensor2:Lru/mera/avral/script/bytecode/demo/Sensor;
  17: invokeinterface #25,  1           // InterfaceMethod ru/mera/avral/script/bytecode/demo/Sensor.temperature:()Lru/mera/avral/script/bytecode/demo/Temperature;
  22: aload_0
  23: getfield      #29                 // Field sensor3:Lru/mera/avral/script/bytecode/demo/Sensor;
  26: invokeinterface #25,  1           // InterfaceMethod ru/mera/avral/script/bytecode/demo/Sensor.temperature:()Lru/mera/avral/script/bytecode/demo/Temperature;
  31: invokespecial #33                 // Method ru/mera/avral/script/bytecode/demo/Average$Impl."<init>":(Lru/mera/avral/script/bytecode/demo/Temperature;Lru/mera/avral/script/bytecode/demo/Temperature;Lru/mera/avral/script/bytecode/demo/Temperature;)V
  34: invokevirtual #36                 // Method ru/mera/avral/script/bytecode/demo/Average$Impl.result:()Lru/mera/avral/script/bytecode/demo/Temperature;
  37: areturn

Semantically, this new implementation is equivalent to the old one: it still takes the temperature value from three sensors, computes their average, and returns it. But instead of storing intermediate values in local variables, it does all the calculations on the operand stack. I can rewrite it that way since all my local variables and fields are final.
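For comparison, here is a minimal sketch of Java source that javac would typically compile to a similar stack-only sequence (new, dup, three getfield/invokeinterface pairs, invokespecial, invokevirtual, areturn, with no astore). The types Temperature, Sensor and Average below are simplified stand-ins invented for this sketch, since the question does not show the real ones, and StackOnlyGateway is a hypothetical name.

```java
// Hypothetical stand-ins for the types used in the question; only their shape matters here.
interface Temperature { double celsius(); }

interface Sensor { Temperature temperature(); }

interface Average {
    Temperature result();

    final class Impl implements Average {
        private final Temperature a;
        private final Temperature b;
        private final Temperature c;

        Impl(Temperature a, Temperature b, Temperature c) {
            this.a = a;
            this.b = b;
            this.c = c;
        }

        @Override
        public Temperature result() {
            final double mean = (a.celsius() + b.celsius() + c.celsius()) / 3.0;
            return () -> mean;
        }
    }
}

final class StackOnlyGateway {
    private final Sensor sensor1;
    private final Sensor sensor2;
    private final Sensor sensor3;

    StackOnlyGateway(Sensor sensor1, Sensor sensor2, Sensor sensor3) {
        this.sensor1 = sensor1;
        this.sensor2 = sensor2;
        this.sensor3 = sensor3;
    }

    Temperature averageTemp() {
        // No named temporaries: each intermediate value stays on the operand stack
        // until the constructor call and the result() call consume it.
        return new Average.Impl(
                sensor1.temperature(),
                sensor2.temperature(),
                sensor3.temperature()).result();
    }
}
```

Because result() is called on an expression of static type Average.Impl rather than Average, the call is compiled as invokevirtual, as in the generated listing above. Running javap -c on the compiled class shows the actual instructions.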

Now there is a question: if I am doing some bytecode-generation-related magic and follow this "all calculations on stack" approach everywhere (assuming that all my variables and fields are final), what potential pitfalls may I face?

NOTE: I have no intention to rewrite bytecode for existing Java classes in the way I described. The example class is given here just to show the method semantics I want to achieve in my bytecode.


Solution

  • As shown by Andreas’ answer, it’s not unusual for Java code to use the stack for temporary values, as in nested expressions. That’s why the instruction set was designed that way, using an operand stack to refer to previously calculated values implicitly. In fact, I’d call your code example, with its excessive use of local variables, unusual.

    If the input of your bytecode-producing tool is not Java code, the number of variables might differ from typical Java code, especially if the input is declarative in nature, so there is no requirement to map all of them directly to local variables in bytecode.

    JVMs like HotSpot transform the code into an SSA form before applying subsequent optimizations. In that form, all transfer operations between local variables and the operand stack, as well as pure stack manipulations like dup and swap, are eliminated anyway, so your choice of using local variables or not will not have any performance impact.

    It might be worth noting that you usually can’t inspect values on the operand stack in debuggers, so you might consider retaining variables when making a debug build (when the LocalVariableTable is generated, too).

    Some code constructs require local variables. E.g. when you have an exception handler, its entry point will have the operand stack cleared, containing only the reference to the exception, so all values it wants to access have to be materialized as local variables. I don’t know if your input form has loop constructs. If so, you will usually convert them from their declarative form to a conventional loop using a mutable variable under the hood, when necessary. Mind the iinc instruction, which works directly with a local variable…
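
The two cases that need local variables can be sketched in Java source as follows. This is only an illustration; the class and method names (LocalsStillNeeded, describe, sumUpTo) are made up for the example. In describe, the value label is used inside the catch block, so it has to be reloaded from a local variable slot. In sumUpTo, the counter is a mutable local, and javac typically compiles i++ to an iinc on that slot.

```java
final class LocalsStillNeeded {

    // Exception handler: a value used in the catch block must live in a local slot.
    static String describe(String text) {
        final String label = "input '" + text + "'"; // stored in a local variable slot
        try {
            // Partial results of this expression sit on the operand stack
            // at the moment parseInt throws; they are discarded.
            return label + " = " + Integer.parseInt(text.trim());
        } catch (NumberFormatException e) {
            // On entry the operand stack holds only 'e';
            // 'label' is reloaded from its local variable slot.
            return label + " is not a number";
        }
    }

    // Loop: the counter is a mutable local variable updated in place.
    static int sumUpTo(int n) {
        int sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i;
        }
        return sum;
    }
}
```

A generator that otherwise keeps everything on the stack would still need to spill to a local any value that is live across the start of a try range and read in its handler.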