java string javac string-concatenation invokedynamic

How does Java internally process a chain of `String` concatenation `+` operators?


Here is the code:

import java.util.Random;

public class StringConcatCompilerOptimization
{
    private static long compute() {
        var random = new Random();
        var l = random.nextLong();
        return l;
    }
    public static void main(String[] args)
    {
        int i = 1;
        byte b = 120;
        boolean l = true;
        String s = "Hello" + i + compute() + b + l + "world!"; // + concatenation chain
        System.out.println(s);
    }
}

javap output:

Compiled from "StringConcatCompilerOptimization.java"
public class StringConcatCompilerOptimization {
  public StringConcatCompilerOptimization();
    Code:
       0: aload_0
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: return

  public static void main(java.lang.String[]);
    Code:
       0: iconst_1
       1: istore_1
       2: bipush        120
       4: istore_2
       5: iconst_1
       6: istore_3
       7: iload_1
       8: invokestatic  #14                 // Method compute:()J
      11: iload_2
      12: iload_3
      13: invokedynamic #19,  0             // InvokeDynamic #0:makeConcatWithConstants:(IJBZ)Ljava/lang/String;
      18: astore        4
      20: getstatic     #23                 // Field java/lang/System.out:Ljava/io/PrintStream;
      23: aload         4
      25: invokevirtual #29                 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      28: return
}

Before Java 9, a + chain such as

String s = "Hello" + i + compute() + b + l + "world!";

was compiled into a StringBuilder.append().append().append()... chain, with the .append() calls in the same order as the + operands. Since Java 9, the same expression is compiled into a single invokedynamic call that passes the operands to the internals of the JVM implementation.
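For reference, the pre-Java 9 translation can be sketched by hand as below. This is only an approximation of what the older javac emitted, not its exact bytecode; the class name PreJava9Concat and the fixed return value of compute() are assumptions made so that the output is reproducible.

```java
class PreJava9Concat {
    // Stand-in for the question's compute(); a fixed value keeps the result deterministic.
    static long compute() {
        return 42L;
    }

    // Roughly the StringBuilder chain that javac 8 and earlier generated for the + chain.
    static String viaBuilder(int i, byte b, boolean l) {
        return new StringBuilder()
                .append("Hello")
                .append(i)          // resolves to append(int)
                .append(compute())  // resolves to append(long)
                .append(b)          // no append(byte) overload: the byte is widened to int
                .append(l)          // resolves to append(boolean)
                .append("world!")
                .toString();
    }

    // The same expression written with the + operator.
    static String viaPlus(int i, byte b, boolean l) {
        return "Hello" + i + compute() + b + l + "world!";
    }

    public static void main(String[] args) {
        System.out.println(viaBuilder(1, (byte) 120, true));
        System.out.println(viaBuilder(1, (byte) 120, true).equals(viaPlus(1, (byte) 120, true)));
    }
}
```

Both forms produce the same string; only the bytecode that javac generates for the + form differs between Java versions.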

Can you please help me understand the internals: what happens between the invokedynamic call and the moment we have the concatenated String?
I also wonder whether boxing occurs (in order to call the appropriate toString() methods, i.e. Long.toString(), Integer.toString(), ...).


Solution

  • As you can see from the bytecode, the key is the StringConcatFactory.makeConcatWithConstants method. This method takes in a bunch of arguments and gives back a CallSite.

    public static CallSite makeConcatWithConstants(
        MethodHandles.Lookup lookup,
        String name,
        MethodType concatType,
        String recipe,
        Object... constants
    )
    

    This CallSite represents a method with the parameter types and return type specified by the concatType parameter. When the CallSite is invoked, it returns a string that is concatenated according to the recipe parameter.

    In your case, concatType has 4 parameters: int, long, byte, boolean (the (IJBZ) in the bytecode). These correspond to the types of the expressions i, compute(), b, l respectively. The recipe parameter will be the string "Hello\u0001\u0001\u0001\u0001world!". Each \u0001 marks where an argument should go.

    What invokedynamic does is call makeConcatWithConstants, get a CallSite, and invoke it with the 4 arguments i, compute(), b, l. (Note that the instructions before invokedynamic push these 4 values onto the operand stack.)

    In Java pseudocode,

    CallSite callSite = StringConcatFactory.makeConcatWithConstants(..., ..., ..., "Hello\u0001\u0001\u0001\u0001world!");
    String s = (String) callSite.getTarget().invoke(i, compute(), b, l);
    

    The first 3 arguments of makeConcatWithConstants are supplied by the JVM itself. The recipe parameter is stored as a static bootstrap argument, referenced from the BootstrapMethods attribute of your class file.

    What this pseudocode doesn't show is that the JVM remembers the CallSite, so the next time this particular invokedynamic is executed, the CallSite is reused and makeConcatWithConstants is not called again.

    Notably, makeConcatWithConstants can create a CallSite that takes primitive types as the method parameters, so there is no boxing needed. Again, the above pseudocode doesn't show this clearly.

    See also my answer here for another example of invokedynamic.
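To make the pseudocode above concrete, here is a minimal sketch that performs the bootstrap step by hand on Java 9 or later. It only illustrates the API; the JVM does the linking natively and caches the result per call site. The class name ManualConcatBootstrap and the helper methods link() and concat() are made up for this example, and 42L stands in for the random value returned by compute().

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatFactory;

class ManualConcatBootstrap {
    // Calls the bootstrap method directly, as the JVM does the first time
    // the invokedynamic instruction is executed.
    static MethodHandle link() throws Exception {
        // Same shape as the descriptor in the bytecode: (IJBZ)Ljava/lang/String;
        MethodType concatType = MethodType.methodType(
                String.class, int.class, long.class, byte.class, boolean.class);
        CallSite callSite = StringConcatFactory.makeConcatWithConstants(
                MethodHandles.lookup(),
                "makeConcatWithConstants",
                concatType,
                "Hello\u0001\u0001\u0001\u0001world!"); // recipe: \u0001 marks each argument slot
        return callSite.getTarget();
    }

    // invokeExact takes the primitives as they are; the target's type has
    // primitive parameters, so no boxing is required to match it.
    static String concat(MethodHandle target, int i, long j, byte b, boolean z) throws Throwable {
        return (String) target.invokeExact(i, j, b, z);
    }

    public static void main(String[] args) throws Throwable {
        MethodHandle target = link();
        System.out.println(target.type());
        System.out.println(concat(target, 1, 42L, (byte) 120, true));
    }
}
```

The handle returned by link() can be invoked any number of times, which mirrors how the JVM reuses the linked CallSite on later executions of the same invokedynamic instruction.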