Relation between bytecode instructions and processor operations

Java specification guarantees primitive variable assignments are always atomic (expect for long and double types.

On the contrary, Fetch-and-Add operation corresponding to the famous i++ increment operation, would be non-atomic because leading to a read-modify-write operation.

Assuming this code:

public void assign(int b) {
    int a = b;
}

The generated bytecode is:

public void assign(int);
    Code:
       0: iload_1       
       1: istore_2      
       2: return

Thus, we see the assignment is composed of two steps (loading and storing).

Assuming this code:

public void assign(int b) {
        int i = b++;
}

Bytecode:

public void assign(int);
    Code:
       0: iload_1       
       1: iinc          1, 1    //extra step here regarding the previous sample
       4: istore_2      
       5: return

Knowing that X86 processor can (at least modern ones), operates increment operation atomically, as said:

In computer science, the fetch-and-add CPU instruction is a special instruction that atomically modifies the contents of a memory location. It is used to implement mutual exclusion and concurrent algorithms in multiprocessor systems, a generalization of semaphores.

Thus, first question: Despite of the fact that bytecode requires both steps (loading and storage), does Java rely on the fact that assignment operation is an operation always carried out atomically whatever the processor's architecture and so can ensure permanent atomicity (for primitive assignments) in its specification?

Second question: Is it wrong to confirm that with very modern X86 processor and without sharing compiled code across different architectures, there's no need at all to synchronize the i++ operation (or AtomicInteger)? Considering it already atomic.

Solution

Considering the Second question.

You imply that i++ will translate into the X86 Fetch-And-Add instruction which is not true. If the code is compiled and optimized by the JVM it may be true (would have to check the source code of JVM to confirm that), but that code can also run in interpreted mode, where the fetch and add are seperated and not synchronized.

Out of curiosity I checked what assembly code is generated for this Java code:

public class Main {
    volatile int a;

  static public final void main (String[] args) throws Exception {
    new Main ().run ();
  }

  private void run () {
      for (int i = 0; i < 1000000; i++) {
        increase ();
      }  
  } 

  private void increase () {
    a++;
  }
}

I used Java HotSpot(TM) Server VM (17.0-b12-fastdebug) for windows-x86 JRE (1.6.0_20-ea-fastdebug-b02), built on Apr 1 2010 03:25:33 version of JVM (this one I had somewhere on my drive).

These is the crucial output of running it (java -server -XX:+PrintAssembly -cp . Main):

At first it is compiled into this:

00c     PUSHL  EBP
    SUB    ESP,8    # Create frame
013     MOV    EBX,[ECX + #8]   # int ! Field  VolatileMain.a
016     MEMBAR-acquire ! (empty encoding)
016     MEMBAR-release ! (empty encoding)
016     INC    EBX
017     MOV    [ECX + #8],EBX ! Field  VolatileMain.a
01a     MEMBAR-volatile (unnecessary so empty encoding)
01a     LOCK ADDL [ESP + #0], 0 ! membar_volatile
01f     ADD    ESP,8    # Destroy frame
    POPL   EBP
    TEST   PollPage,EAX ! Poll Safepoint

029     RET

Then it is inlined and compiled into this:

0a8   B11: #    B11 B12 &lt;- B10 B11   Loop: B11-B11 inner stride: not constant post of N161 Freq: 0.999997
0a8     MOV    EBX,[ESI]    # int ! Field  VolatileMain.a
0aa     MEMBAR-acquire ! (empty encoding)
0aa     MEMBAR-release ! (empty encoding)
0aa     INC    EDI
0ab     INC    EBX
0ac     MOV    [ESI],EBX ! Field  VolatileMain.a
0ae     MEMBAR-volatile (unnecessary so empty encoding)
0ae     LOCK ADDL [ESP + #0], 0 ! membar_volatile
0b3     CMP    EDI,#1000000
0b9     Jl,s  B11   # Loop end  P=0.500000 C=126282.000000

As you can see it does not use Fetch-And-Add instructions for a++.