
java record compact constructor bytecode


I was transforming a simple class to a record, which looks like this:

public class Mine {
    private final String name;
    public Mine(String name) {
        this.name = name == null ? "a" : name;
    }
}

Instinctively, I wrote it like this:

public record Mine(String name) {
    public Mine {
        this.name = name == null ? "a" : name;
    }
}

which fails to compile: cannot assign a value to final variable name.

I was a little confused, because this compact constructor works:

    public Mine {
        name = name == null ? "a" : name;
    }

I could not really understand what is going on, so I decided to look at the bytecode:

         0: aload_0
         1: invokespecial #1                  // Method java/lang/Record."<init>":()V
         4: aload_1
         5: ifnonnull     13
         8: ldc           #7                  // String a
        10: goto          14
        13: aload_1
        14: astore_1
        15: aload_0
        16: aload_1
        17: putfield      #9                  // Field name:Ljava/lang/String;
        20: return

Seems like javac, when it sees an assignment to a record variable (name) will actually "save" that to a local variable:

astore_1

and then it does this.name=<local>, something like this:

public Mine {
   String local = name == null ? "a" : name;
   this.name = local;
}

If you look at the bytecode of an equivalent class:

public class Mine {
    private final String name;

    public Mine(String name) {
        String local = name == null ? "a" : name;
        this.name = local;
    }
}

it is almost the same, with the difference that aload_2 is used instead of aload_1, which is not a big deal, and most probably has to do with compatibility reasons.

Can someone confirm if my understanding is correct?


Solution

    • A compact record constructor has implicit parameters corresponding to each record component, matching the declaration of the record. (You write public Mine {}, and it acts like public Mine(String name) {}, because your record was defined as record Mine(String name).)
    • These are parameters: when you refer to name in your compact constructor, it refers to that parameter. They are not final.
    • At the very end of your constructor, all parameters (that is, the value they have at that point; you can change them, given that they aren't final) are written into the fields, which are final and cannot be made non-final. It's as if the compiler adds this.name = name; for you at the end of the body. (A compact constructor is not allowed to contain a return statement, so apart from a thrown exception the end of the body is the only way out.) You can't ask the compiler to skip this step.
    • Given that your fields are auto-assigned at the end, and are always final, you cannot assign them anywhere in your own code. After all, if you did, you would have assigned them once, and the automatically generated this.name = name would then assign them a second time, which is illegal java for a final field. Therefore, this.anyField = ... is an instant error in a compact constructor; you can never write it there. (A canonical constructor written out with its full parameter list gets no automatic assignments and has to assign every field itself.)
    • Given that the fields are assigned at the end, reading them before then is invalid, as they haven't been set yet. Therefore, just like the previous bullet said that assigning them (this.field = ...) is necessarily an error, so is reading them. Conclusion: this.field is wrong in any compact constructor. The compiler knows what you mean when you write it (i.e. syntactically it is valid java; the compiler understands what it means), but will always emit an error when you do so (i.e. it is semantically invalid).
    • Thus, life's simple: Just use the 'field name' (here, name), without this, and everything just works. In fact, given that it is a hidden param, you can't even shadow it out: String name = "haha shadowed out!"; inside your compact constructor wouldn't be legal either, for the same reason void testMethod(String x) { String x = ""; } doesn't compile. You can't re-declare a variable in the same scope with the same name.
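
To make the bullets above concrete, here is a minimal sketch that puts the compact form next to a hand-written canonical constructor doing the same job. MineExplicit and CompactConstructorDemo are hypothetical names introduced only for this illustration; the record Mine mirrors the one from the question.

```java
// Compact form: 'name' is the implicit parameter, not the field.
record Mine(String name) {
    Mine {
        // Reassigns the parameter; the compiler then appends this.name = name.
        name = name == null ? "a" : name;
    }
}

// Hypothetical equivalent with the canonical constructor written out in full.
// Here nothing is assigned automatically, so the field must be set by hand.
record MineExplicit(String name) {
    MineExplicit(String name) {
        this.name = name == null ? "a" : name;
    }
}

public class CompactConstructorDemo {
    public static void main(String[] args) {
        System.out.println(new Mine(null).name());          // a
        System.out.println(new Mine("b").name());           // b
        System.out.println(new MineExplicit(null).name());  // a
    }
}
```

Both records end up with the same observable behavior; the compact form simply leaves the parameter list and the final field assignment to the compiler.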

    You see this 'has hidden parameters and writes them out to the fields at the end' in the bytecode. Specifically, the last part:

    15: aload_0
    16: aload_1
    17: putfield      #9                  // Field name:Ljava/lang/String;

    is bytecode-ese for this.name = param1. In an instance method or constructor, local variable slot 0 always holds the this ref, and slot 1 is used here for the name parameter. The operation is 'write this value to that field' (that's what putfield does), and to do this job, the stack needs to hold: [A] the receiver, and [B] the value to put there. Hence, aload_0 (loads this) and then aload_1 (loads the name param).

    The part where name = name == null ? "a" : name overwrites the value that the above bytecode later loads via aload_1 is this:

    4: aload_1
    5: ifnonnull     13
    8: ldc           #7                  // String a
    10: goto          14
    13: aload_1
    14: astore_1
    

    aload_1 is still 'load param name'. So, 4 loads it and 5 does a null check on it, consuming it (bytecode is stack based: aload_1 pushes the name value onto the stack; ifnonnull pops a value off the stack, hops to offset 13 if the value wasn't null, and simply falls through to the next instruction if it was).

    If it was null, the next instruction is ldc (which is short for 'load constant'); this pushes the constant value "a" onto the stack. It then jumps to 14.

    If it wasn't null, we hop to 13: We aload_1 again (we push the param value name onto the stack), and end up at 14 too. Either way, 14 now STORES whatever is on top of the stack back into the name parameter's slot (astore_1).
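
The same slot logic also accounts for the aload_2 noticed in the question. As a rough sketch (ReassignParam, ExtraLocal and SlotDemo are hypothetical names, and the slot comments describe what javac typically emits rather than anything guaranteed here; javap -c on the compiled classes is the way to check):

```java
// Reassigning the parameter: the new value goes back into slot 1.
class ReassignParam {
    private final String name;

    ReassignParam(String name) {            // slot 0 = this, slot 1 = name
        name = name == null ? "a" : name;   // ends with astore_1
        this.name = name;                   // aload_0, aload_1, putfield
    }

    String name() { return name; }
}

// Declaring a fresh local: it takes the next free slot, which is 2.
class ExtraLocal {
    private final String name;

    ExtraLocal(String name) {                       // slot 0 = this, slot 1 = name
        String local = name == null ? "a" : name;   // ends with astore_2
        this.name = local;                          // aload_0, aload_2, putfield
    }

    String name() { return name; }
}

public class SlotDemo {
    public static void main(String[] args) {
        // Both variants behave identically; only the local slot differs.
        System.out.println(new ReassignParam(null).name()); // a
        System.out.println(new ExtraLocal(null).name());    // a
    }
}
```

So the slot 2 in the class version comes from the extra local variable, while the compact constructor writes straight back into the parameter's own slot.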

    SIDENOTE: bytecode wise this seems very inefficient (why aload_1 and then astore_1? Why not skip past both the aload AND the astore with the jump statement?) - but javac isn't designed to emit efficient bytecode. Unlike e.g. C compilers, javac intentionally does next to no optimization beyond things like constant folding: it has no optimization levels (no -O3 or similar command line switch the way e.g. gcc has), and must follow spec to the letter.

    The reason: that sort of optimization does happen in java, but at runtime, by HotSpot's JIT compiler, not by javac.