java string jvm javac

Does javac optimize "foo".length()?


If I have this statement/literal in the code:

"foobar".length()

Does the compiler replace it with 6, or does it call the method on the interned instance? Writing it this way is more readable than a bare, meaningless 6, but I don't want to invoke the method over and over again, even though I know it is little more than a getter for a private field.

If it is replaced, does javac keep some kind of whitelist of methods that are safe to evaluate at compile time on particular literals (or perhaps all methods that take no parameters and don't depend on state from the environment)? What about possible side effects, and so on?


Solution

  • It is not replaced by javac: in the Java bytecode you will see an explicit call to the length() method. During JIT compilation, however, it will likely be replaced by a constant, because the JIT compiler is smart enough to inline length() and detect that it returns the length of the final array field of the constant string.

    In general, javac applies very few optimizations when it produces bytecode. Most of the hard work is done during JIT compilation. You should not worry about performance in this case: hot code will be JIT-compiled.

    To back up my answer, here's a simple method:

    public static long lengthTest() {
        long sum = 0;
        for(int i=1; i<="abcdef".length(); i++) sum+=i*2;
        return sum;
    }
    

    Here's the bytecode:

     0: lconst_0        
     1: lstore_0        
     2: iconst_1        
     3: istore_2        
     4: iload_2         
     5: ldc             #5   // String abcdef
     7: invokevirtual   #6   // Method java/lang/String.length:()I
    10: if_icmpgt       26   
    13: lload_0         
    14: iload_2         
    15: iconst_2        
    16: imul            
    17: i2l             
    18: ladd            
    19: lstore_0        
    20: iinc            2, 1 
    23: goto            4    
    26: lload_0         
    27: lreturn         
    

    As you can see, there's an explicit call to length().
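    On the "whitelist" part of the question: javac does fold compile-time constant expressions (for example, concatenation of string literals), but the Java Language Specification's definition of a constant expression does not include method invocations, so no method call on a literal is evaluated at compile time. The sketch below illustrates the difference. The class and method names are hypothetical and chosen only for this illustration.

```java
// Hypothetical demo class; the names are illustrative and not part of the original answer.
class ConstantFoldingDemo {
    // A constant expression: javac folds the concatenation and emits the literal "foobar".
    static final String FOLDED = "foo" + "bar";

    // Not a constant expression: a method invocation never qualifies,
    // so String.length() is actually called when the class is initialized.
    static final int LENGTH = "foobar".length();

    static boolean foldedIsSameObject() {
        // Both operands are compile-time constants, so they refer to the same interned String.
        return FOLDED == "foobar";
    }

    static boolean runtimeConcatIsSameObject() {
        String foo = "foo";          // not final, so foo + "bar" is not a constant expression
        String joined = foo + "bar"; // concatenated at run time into a new String object
        return joined == "foobar";
    }

    public static void main(String[] args) {
        System.out.println(foldedIsSameObject());
        System.out.println(runtimeConcatIsSameObject());
        System.out.println(LENGTH);
    }
}
```

    Running javap -c on the compiled class should show a plain ldc of "foobar" for FOLDED, while LENGTH is initialized through an invokevirtual of String.length(), just like in the bytecode listing above.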

    Here's the JIT-compiled code (x64 ASM):

    sub $0x18,%rsp
    mov %rbp,0x10(%rsp)  ;*synchronization entry
                         ; - Test::lengthTest@-1 (line 12)
    mov $0x2a,%eax
    add $0x10,%rsp
    pop %rbp
    test %eax,-0x260e95c(%rip)  # 0x0000000000330000
                                ;   {poll_return}
    retq
    

    As you can see, the whole method body was effectively replaced by a single constant (mov $0x2a,%eax; 0x2a is 42, which is the actual result of the method). So the JIT compiler not only inlined length(), it reduced the whole method body to a constant!
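    For anyone who wants to reproduce this, here is a minimal harness, offered as a sketch rather than the exact setup used for the listings above. The class name, the warmUp helper, and the iteration count are assumptions; the loop only exists to call lengthTest() often enough for HotSpot to consider it hot.

```java
// Hypothetical harness; class name, helper, and iteration count are assumptions.
class LengthTestHarness {
    // Same logic as the method in the answer.
    static long lengthTest() {
        long sum = 0;
        for (int i = 1; i <= "abcdef".length(); i++) sum += i * 2;
        return sum;
    }

    // Calls lengthTest() repeatedly so the JIT compiler is likely to compile it.
    static long warmUp(int iterations) {
        long total = 0;
        for (int n = 0; n < iterations; n++) total += lengthTest();
        return total;
    }

    public static void main(String[] args) {
        // The result is printed so the work cannot be discarded as dead code.
        System.out.println(warmUp(1_000_000));
    }
}
```

    The bytecode can be inspected with javap -c LengthTestHarness. Printing the JIT-compiled code on HotSpot is done with java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly LengthTestHarness, which requires the hsdis disassembler plugin to be installed. The exact assembly will differ between JVM versions and platforms.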