Tags: c++, assembly, concurrency, volatile

Why does adding a volatile qualifier to a variable not prevent instruction reordering?


I have a simple C++ code snippet as shown below:

int A;
int B;

void foo() {
    A = B + 1;
    // asm volatile("" ::: "memory");
    B = 0;
}

When I compile this code, the generated assembly code is reordered as follows:

foo():
        mov     eax, DWORD PTR B[rip]
        mov     DWORD PTR B[rip], 0
        add     eax, 1
        mov     DWORD PTR A[rip], eax
        ret
B:
        .zero   4
A:
        .zero   4

However, when I add a memory fence (the commented-out line in the C++ code; strictly speaking it is a compiler barrier rather than a hardware fence), the instructions are not reordered. My understanding is that adding a volatile qualifier to a variable should also prevent instruction reordering. So, I modified the code to add volatile to variable B:

int A;
volatile int B;

void foo() {
    A = B + 1;
    B = 0;
}

To my surprise, the generated assembly code still shows reordered instructions. Can someone explain why the volatile qualifier did not prevent instruction reordering in this case?

The code is available on Godbolt.


Solution

  • "My understanding is that adding a volatile qualifier to a variable should also prevent instruction reordering."

    That's a major oversimplification. Although the C++ standard doesn't define the semantics of volatile very explicitly (saying only that "accesses are evaluated strictly according to the rules of the abstract machine"), the unwritten rule is that volatile objects are treated as if some external entity (e.g. I/O hardware) may be reading and writing them asynchronously, and that both reads and writes are side effects that the external entity can observe. As such, each read/write to a volatile object (of machine word size or less) should result in the execution of exactly one load/store instruction.
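As a minimal sketch of the "one access, one load/store" rule described above: the names status_reg and read_twice are hypothetical and stand in for something like a memory-mapped device register. Because the object is volatile, a compiler is expected to emit two separate loads here; for a plain int it would be free to load once and double the value.

```cpp
// Hypothetical stand-in for a hardware register that may change on its own.
volatile int status_reg = 0;

// Reads the register twice. Each read of a volatile object is a side
// effect, so the two reads may not be merged into a single load.
int read_twice() {
    int first = status_reg;   // first volatile load
    int second = status_reg;  // second volatile load, not folded into the first
    return first + second;
}
```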

    From this it follows that loads and stores to volatile objects will not be reordered with each other. But in your program A is not volatile, so we assume that the external entity does not see it. Therefore it does not matter how the accesses to A are ordered with respect to accesses to B or anything else, and the compiler is free to reorder them. Instructions like add eax, 1 that do not access memory at all are also fair game; the external entity can't see the machine registers either.
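To illustrate the point about ordering between volatile accesses, here is a sketch of the question's function with both globals declared volatile. This is only meant to show what the rule implies, not to recommend it: the compiler should now keep the store to A ahead of the store to B, but this constrains the compiler's output only and says nothing about what another thread is guaranteed to observe.

```cpp
// Same shape as the code in the question, but BOTH globals are volatile.
volatile int A;
volatile int B;

void foo() {
    A = B + 1;  // volatile load of B, then volatile store to A
    B = 0;      // volatile store to B; may not be moved ahead of the store to A
}
```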

    Per your use of the concurrency tag, this is one of the many reasons that volatile is not the right approach for variables shared between threads: unlike the "external entity", another thread does have access to your non-volatile variables. Before C++11, people used volatile because it was all there was, and you could make it work with explicit memory barrier functions if you knew something about how your compiler did optimizations (which was usually undocumented). Since C++11 we have std::atomic, and that (together with the standard synchronization primitives such as std::mutex) is the right way to handle inter-thread sharing. Unfortunately, the association with volatile lingers on in obsolete docs and in the minds of old-timers. See "Why is volatile not considered useful in multithreaded C or C++ programming?" for more.
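For comparison, a minimal sketch of inter-thread sharing with std::atomic follows. The names payload, ready, producer and consumer are hypothetical and not taken from the question. The release store may not be reordered ahead of the earlier plain store, and an acquire load that observes true makes that plain store visible to the reader.

```cpp
#include <atomic>

int payload = 0;                 // ordinary data, published through `ready`
std::atomic<bool> ready{false};  // synchronization flag

// Intended to run in the writing thread.
void producer() {
    payload = 42;                                  // plain store
    ready.store(true, std::memory_order_release);  // publishes the store above
}

// Intended to run in the reading thread.
int consumer() {
    while (!ready.load(std::memory_order_acquire)) {
        // spin until the producer has published
    }
    return payload;  // the acquire load above makes the write to payload visible
}
```

In a real program, producer and consumer would typically run on separate std::thread objects; the ordering guarantee comes from the release/acquire pair, not from any volatile qualifier.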

    Also relevant: "Does the C++ volatile keyword introduce a memory fence?" (No, it does not, as you have discovered.)
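If what you actually want is the effect of the commented-out asm statement in the question, the standard library offers fences that do not rely on volatile. The sketch below uses hypothetical function names. It assumes, as is the case in practice with GCC and Clang, that std::atomic_signal_fence acts as a compiler-only barrier, while std::atomic_thread_fence additionally emits a hardware fence instruction where the target requires one. Note that fences alone do not make concurrent access to plain int variables safe; they only illustrate the ordering effect.

```cpp
#include <atomic>

int A;
int B;

// Compiler-only barrier: restricts compiler reordering across this point,
// roughly comparable to asm volatile("" ::: "memory") on GCC/Clang.
void foo_compiler_barrier() {
    A = B + 1;
    std::atomic_signal_fence(std::memory_order_seq_cst);
    B = 0;
}

// Full fence: also constrains the CPU, at the cost of a fence instruction.
void foo_thread_fence() {
    A = B + 1;
    std::atomic_thread_fence(std::memory_order_seq_cst);
    B = 0;
}
```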