How does the just-in-time compiler optimize Java parallel streams?


Some time ago an interesting question was asked:

Can (a == 1 && a == 2 && a == 3) evaluate to true in Java?

I decided to prove that it is possible using the Java 8 Stream API (parallel streams, to be precise). Here is my code, which prints "Success" only in very rare cases:

import java.util.stream.IntStream;

class Race {
    private static int a;

    public static void main(String[] args) {
        IntStream.range(0, 100_000).parallel().forEach(i -> {
            a = 1;
            a = 2;
            a = 3;
            testValue();
        });
    }

    private static void testValue() {
        if (a == 1 && a == 2 && a == 3) {
            System.out.println("Success");
        }
    }
}

Then I wondered whether JIT compiler optimizations were the reason it succeeds so rarely. So I ran the code with the following VM option:

-Djava.compiler=NONE

With the JIT disabled, the number of successful cases increased significantly!

How does the just-in-time compiler optimize parallel streams in a way that affects the execution of the code above?


Solution

  • Streams don't matter. The same effect can be observed with just two simple threads like in this answer.

    When a is not volatile, the JIT compiler can merge consecutive assignments (and it actually does!).

        a = 1;
        a = 2;
        a = 3;
    

    is transformed to

        a = 3;
    

    Furthermore, the JIT compiler also reduces if (a == 1 && a == 2 && a == 3) to if (false) and then safely removes the entire testValue() call as dead code.
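
Both rewrites are invisible to the thread that executes them, which is why the compiler may apply them to a non-volatile field. A minimal sketch of that single-threaded equivalence follows. The names SingleThreadView, asWritten and asOptimized are made up for illustration, and asOptimized is a hand-written approximation of the compiled result, not actual JIT output:

```java
class SingleThreadView {
    private static int a;

    // The code as it appears in the source.
    static boolean asWritten() {
        a = 1;
        a = 2;
        a = 3;
        // Seen from this thread alone, a is 3 here, so the test is false.
        return a == 1 && a == 2 && a == 3;
    }

    // Hand-written approximation of what the JIT reduces it to:
    // one store, and the condition folded to a constant.
    static boolean asOptimized() {
        a = 3;
        return false;
    }

    public static void main(String[] args) {
        // Both variants return false and leave a == 3.
        System.out.println(asWritten() + " " + asOptimized() + " " + a);
    }
}
```

Run on a single thread, the two methods cannot be told apart. Only a concurrent writer could make the original condition true, and without volatile the compiler does not have to account for one.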

    Let's look into the assembly generated for the lambda.
    To print the compiled code I use -XX:CompileCommand=print,Race::lambda$main$0.

      # {method} {0x000000001e142de0} 'lambda$main$0' '(I)V' in 'Race'
      # parm0:    rdx       = int
      #           [sp+0x20]  (sp of caller)
      0x00000000052eb740: sub     rsp,18h
      0x00000000052eb747: mov     qword ptr [rsp+10h],rbp  ;*synchronization entry
                                                    ; - Race::lambda$main$0@-1 (line 8)
    
      0x00000000052eb74c: mov     r10,76b8940c0h    ;   {oop(a 'java/lang/Class' = 'Race')}
      0x00000000052eb756: mov     dword ptr [r10+68h],3h  ;*putstatic a
                                                    ; - Race::lambda$main$0@9 (line 10)
    
      0x00000000052eb75e: add     rsp,10h
      0x00000000052eb762: pop     rbp
      0x00000000052eb763: test    dword ptr [3470000h],eax
                                                    ;   {poll_return}
      0x00000000052eb769: ret
    

    Besides the method prologue and epilogue, there is just one instruction, which stores the value 3:

      mov     dword ptr [r10+68h],3h  ;*putstatic a
    

    So, once the method is compiled, System.out.println never happens. The rare cases in which you see "Success" occur during interpretation, before the code has been JIT-compiled.
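
To illustrate the point that streams don't matter, here is a hedged sketch of the same race with two plain threads. It is not the code from the linked answer. The class name TwoThreadRace and the helper countHits are invented for this example. The field is declared volatile on the assumption, taken from the answer above, that the JIT then keeps every store and every read:

```java
class TwoThreadRace {
    // Assumption (following the answer above): with volatile, the JIT
    // neither merges the three stores nor folds the condition to false.
    private static volatile int a;
    private static volatile boolean running;

    // Checks the condition `attempts` times while a second thread keeps
    // rewriting a, and returns how many checks succeeded.
    static long countHits(long attempts) {
        running = true;
        Thread writer = new Thread(() -> {
            while (running) {
                a = 1;
                a = 2;
                a = 3;
            }
        });
        writer.start();

        long hits = 0;
        for (long i = 0; i < attempts; i++) {
            if (a == 1 && a == 2 && a == 3) {
                hits++;
            }
        }

        running = false;
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println("Hits: " + countHits(10_000_000L));
    }
}
```

The number of hits depends on the machine, the scheduler and the run, so it can be zero, especially on a single core. Running the program once normally and once with -Xint (interpreter only) is one way to compare compiled and interpreted behavior.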