Search code examples
c++undefined-behavior

In practice, why would different compilers compute different values of int x = ++i + ++i;?


Consider this code:

int i = 1;
int x = ++i + ++i;

We have some guesses for what a compiler might do for this code, assuming it compiles.

  1. both ++i return 2, resulting in x=4.
  2. one ++i returns 2 and the other returns 3, resulting in x=5.
  3. both ++i return 3, resulting in x=6.

To me, the second seems most likely. One of the two ++ operators is executed with i = 1, the i is incremented, and the result 2 is returned. Then the second ++ operator is executed with i = 2, the i is incremented, and the result 3 is returned. Then 2 and 3 are added together to give 5.

However, I ran this code in Visual Studio, and the result was 6. I'm trying to understand compilers better, and I'm wondering what could possibly lead to a result of 6. My only guess is that the code could be executed with some "built-in" concurrency. The two ++ operators were called, each incremented i before the other returned, and then they both returned 3. This would contradict my understanding of the call stack, and would need to be explained away.

What (reasonable) things could a C++ compiler do that would lead to a result of 4 or a result or 6?

Note

This example appeared as an example of undefined behavior in Bjarne Stroustrup's Programming: Principles and Practice using C++ (C++ 14).

See cinnamon's comment.


Solution

  • The compiler takes your code, splits it into very simple instructions, and then recombines and arranges them in a way that it thinks optimal.

    The code

    int i = 1;
    int x = ++i + ++i;
    

    consists of the following instructions:

    1. store 1 in i
    2. read i as tmp1
    3. add 1 to tmp1
    4. store tmp1 in i
    5. read i as tmp2
    6. read i as tmp3
    7. add 1 to tmp3
    8. store tmp3 in i
    9. read i as tmp4
    10. add tmp2 and tmp4, as tmp5
    11. store tmp5 in x
    

    But despite this being a numbered list the way I wrote it, there are only a few ordering dependencies here: 1->2->3->4->5->10->11 and 1->6->7->8->9->10->11 must stay in their relative order. Other than that the compiler can freely reorder, and perhaps eliminate redundancy.

    For example, you could order the list like this:

    1. store 1 in i
    2. read i as tmp1
    6. read i as tmp3
    3. add 1 to tmp1
    7. add 1 to tmp3
    4. store tmp1 in i
    8. store tmp3 in i
    5. read i as tmp2
    9. read i as tmp4
    10. add tmp2 and tmp4, as tmp5
    11. store tmp5 in x
    

    Why can the compiler do this? Because there's no sequencing to the side effects of the increment. But now the compiler can simplify: for example, there's a dead store in 4: the value is immediately overwritten. Also, tmp2 and tmp4 are really the same thing.

    1. store 1 in i
    2. read i as tmp1
    6. read i as tmp3
    3. add 1 to tmp1
    7. add 1 to tmp3
    8. store tmp3 in i
    5. read i as tmp2
    10. add tmp2 and tmp2, as tmp5
    11. store tmp5 in x
    

    And now everything to do with tmp1 is dead code: it's never used. And the re-read of i can be eliminated too:

    1. store 1 in i
    6. read i as tmp3
    7. add 1 to tmp3
    8. store tmp3 in i
    10. add tmp3 and tmp3, as tmp5
    11. store tmp5 in x
    

    Look, this code is much shorter. The optimizer is happy. The programmer is not, because i was only incremented once. Oops.

    Let's look at something else the compiler can do instead: let's go back to the original version.

    1. store 1 in i
    2. read i as tmp1
    3. add 1 to tmp1
    4. store tmp1 in i
    5. read i as tmp2
    6. read i as tmp3
    7. add 1 to tmp3
    8. store tmp3 in i
    9. read i as tmp4
    10. add tmp2 and tmp4, as tmp5
    11. store tmp5 in x
    

    The compiler could reorder it like this:

    1. store 1 in i
    2. read i as tmp1
    3. add 1 to tmp1
    4. store tmp1 in i
    6. read i as tmp3
    7. add 1 to tmp3
    8. store tmp3 in i
    5. read i as tmp2
    9. read i as tmp4
    10. add tmp2 and tmp4, as tmp5
    11. store tmp5 in x
    

    and then notice again that i is read twice, so eliminate one of them:

    1. store 1 in i
    2. read i as tmp1
    3. add 1 to tmp1
    4. store tmp1 in i
    6. read i as tmp3
    7. add 1 to tmp3
    8. store tmp3 in i
    5. read i as tmp2
    10. add tmp2 and tmp2, as tmp5
    11. store tmp5 in x
    

    That's nice, but it can go further: it can reuse tmp1:

    1. store 1 in i
    2. read i as tmp1
    3. add 1 to tmp1
    4. store tmp1 in i
    6. read i as tmp1
    7. add 1 to tmp1
    8. store tmp1 in i
    5. read i as tmp2
    10. add tmp2 and tmp2, as tmp5
    11. store tmp5 in x
    

    Then it can eliminate the re-read of i in 6:

    1. store 1 in i
    2. read i as tmp1
    3. add 1 to tmp1
    4. store tmp1 in i
    7. add 1 to tmp1
    8. store tmp1 in i
    5. read i as tmp2
    10. add tmp2 and tmp2, as tmp5
    11. store tmp5 in x
    

    Now 4 is a dead store:

    1. store 1 in i
    2. read i as tmp1
    3. add 1 to tmp1
    7. add 1 to tmp1
    8. store tmp1 in i
    5. read i as tmp2
    10. add tmp2 and tmp2, as tmp5
    11. store tmp5 in x
    

    and now 3 and 7 can be merged into one instruction:

    1. store 1 in i
    2. read i as tmp1
    3+7. add 2 to tmp1
    8. store tmp1 in i
    5. read i as tmp2
    10. add tmp2 and tmp2, as tmp5
    11. store tmp5 in x
    

    Eliminate the last temporary:

    1. store 1 in i
    2. read i as tmp1
    3+7. add 2 to tmp1
    8. store tmp1 in i
    10. add tmp1 and tmp1, as tmp5
    11. store tmp5 in x
    

    And now you get the result that Visual C++ is giving you.

    Note that in both optimization paths, the important order dependencies were preserved, insofar as the instructions weren't removed for doing nothing.