c++ cpu barrier

Does the CPU reorder instructions that depend on each other?


I know that a CPU can reorder instructions like

load A
load B

But will the CPU reorder the following code? (In other words, can a second thread running on another core see the results in reverse order?)

some_array[array_index] = new_value;
++array_index;

I'm guessing it will NEVER because the second line is dependent on the first line. Am I right?


Solution

  • Nope, you're wrong.

    The compiler and the CPU are entirely free to optimize and reorder the code as long as the result is the same for every behavior that is guaranteed. The order in which other threads see the modifications is not guaranteed. So, for example, the compiler and CPU are free to execute the code as if it had been written like this:

    ++array_index;
    some_array[array_index - 1] = new_value;
    

    Or even:

    tmp = array_index;
    ++array_index;
    some_array[tmp] = new_value;
    

    Since no guarantees are violated, the compiler and CPU are free to make these optimizations. That's a good thing because they may make the code significantly faster.

    If you need more guarantees, you can use the appropriate tools (locks, barriers, atomics, whatever) to get them. But there's no reason code that doesn't care about this stuff should be denied optimizations like this.
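
    As one possible illustration of the "atomics" option, here is a minimal sketch that publishes the index with release/acquire ordering. It assumes a single writer thread, and the names kCapacity, append, published_prefix_is_complete and run_publish_demo are made up for this example; only some_array and array_index come from the question.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>

constexpr std::size_t kCapacity = 16;     // hypothetical fixed capacity
int some_array[kCapacity];                // plain (non-atomic) storage
std::atomic<std::size_t> array_index{0};  // number of published elements

// Writer side (assumption: exactly one thread ever calls append).
// The element is stored first; the release store of the new index then
// guarantees the element is visible to anyone who observes that index.
void append(int new_value) {
    std::size_t i = array_index.load(std::memory_order_relaxed);
    assert(i < kCapacity);
    some_array[i] = new_value;
    array_index.store(i + 1, std::memory_order_release);
}

// Reader side: the acquire load pairs with the writer's release store, so
// every slot below the observed index must already hold its final value.
// The demo writes 1, 2, 3, ... so slot i is expected to hold i + 1.
bool published_prefix_is_complete() {
    std::size_t n = array_index.load(std::memory_order_acquire);
    for (std::size_t i = 0; i < n; ++i) {
        if (some_array[i] != static_cast<int>(i) + 1) return false;
    }
    return true;
}

// Runs one writer and one reader concurrently and reports whether the
// reader ever saw an index whose element had not arrived yet.
bool run_publish_demo() {
    std::atomic<bool> ok{true};
    std::thread writer([] {
        for (int v = 1; v <= 10; ++v) append(v);
    });
    std::thread reader([&ok] {
        for (int k = 0; k < 1000; ++k) {
            if (!published_prefix_is_complete()) ok.store(false);
        }
    });
    writer.join();
    reader.join();
    return ok.load() && array_index.load() == 10;
}
```

    With a plain (non-atomic) array_index, the same access pattern would be a data race in C++, and nothing would stop the index from becoming visible before the element.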

    Here's where you went wrong:

    "I'm guessing it will NEVER because the second line is dependent on the first line. Am I right?"

    Only if neither the CPU nor the compiler can figure out the dependency and untangle it. But that's trivially obvious here: the store needs only the old value of array_index, so once that value has been captured, the store and the increment can be performed in either order. Compilers and CPUs do in fact figure out these kinds of dependencies (and far more complicated ones) because our software would be much slower if they didn't.

    But even if the dependency could not be untangled, that wouldn't require the actual writes to be made visible to other threads in any particular order. They could sit in the CPU's write posting (store) buffers and, depending on the architecture's memory model, reach memory in a different order. And the other thread might reorder its reads too, with its CPU reading ahead.
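
    Because both the writing side and the reading side can reorder, ordering has to be requested on both. The following is a small sketch of the "barriers" option using standard C++ fences; payload, ready, producer, consumer and run_fence_demo are hypothetical names chosen for illustration and are not part of the question.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;                 // plain data handed from one thread to another
std::atomic<bool> ready{false};  // flag telling the reader the data is there

// Writer: the release fence keeps the payload write from being ordered
// after the (relaxed) flag store that follows it.
void producer() {
    payload = 42;
    std::atomic_thread_fence(std::memory_order_release);
    ready.store(true, std::memory_order_relaxed);
}

// Reader: the acquire fence keeps the payload read from being performed
// ahead of the flag load that precedes it (no "read ahead" past the flag).
int consumer() {
    while (!ready.load(std::memory_order_relaxed)) {
        std::this_thread::yield();
    }
    std::atomic_thread_fence(std::memory_order_acquire);
    return payload;
}

// Runs both sides on separate threads and returns what the reader saw.
int run_fence_demo() {
    int seen = 0;
    std::thread c([&seen] { seen = consumer(); });
    std::thread p(producer);
    p.join();
    c.join();
    return seen;
}
```

    Dropping either fence removes the guarantee: without the writer's fence the flag may become visible before the payload, and without the reader's fence the payload may be read before the flag is checked.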