c  c99  language-lawyer  c11  unspecified-behavior

Is (x++, y) + (y++, x) undefined or unspecified, and if unspecified, what can it compute?


The comma operator introduces a sequence point between the evaluation of its left and right operands. I am wondering whether this means that the program below avoids undefined behavior.

int x, y;

int main()
{
  return (x++, y) + (y++, x);
}

If it does avoid undefined behavior, it could still be unspecified, that is, return one of several possible values. I would think that in C99, it can only compute 1, but actually, various versions of GCC compile this program into an executable that returns 2. Clang generates an executable that returns 1, apparently agreeing with my intuition.

Lastly, is this something that changed in C11?


Solution

  • Take the expression:

    (x++, y) + (y++, x)
    

    The order in which the operands of + are evaluated is unspecified, so suppose the implementation goes left to right:

    x++  // yield 0, schedule increment of x
    ,    // sequence point: x definitely incremented now
    y    // yield 0
    y++  // yield 0, schedule increment of y
    // explode because you just read from y and wrote to y
    // with no intervening sequence point
    

    Nothing in the standard forbids this order of evaluation, so the expression as a whole has undefined behavior.
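
    The hazard in the trace above is that y is read and then modified with no sequence point in between. As a minimal sketch (all function names here are hypothetical and do not come from the question), giving each operand its own full expression removes that hazard, and writing out one interleaving by hand shows how a result of 2 could arise. This only illustrates the reasoning; it is not a claim about the code any particular compiler generates.

    ```c
    #include <assert.h>

    int x, y;

    /* Left operand first. Each declaration is a full expression,
       so a sequence point separates the read of y from y++. */
    int left_then_right(void)
    {
        x = 0; y = 0;
        int left = (x++, y);   /* x becomes 1, left is 0 */
        int right = (y++, x);  /* y becomes 1, right is 1 */
        return left + right;
    }

    /* Right operand first, again with well-defined sequencing. */
    int right_then_left(void)
    {
        x = 0; y = 0;
        int right = (y++, x);  /* y becomes 1, right is 0 */
        int left = (x++, y);   /* x becomes 1, left is 1 */
        return left + right;
    }

    /* Hypothetical interleaving: both increments happen before either read.
       Each increment still precedes the read on its own side of the comma. */
    int increments_then_reads(void)
    {
        x = 0; y = 0;
        x++;
        y++;
        int left = y;   /* 1 */
        int right = x;  /* 1 */
        return left + right;
    }

    /* Self-check of the three orders sketched above. */
    void check_inline_orders(void)
    {
        assert(left_then_right() == 1);
        assert(right_then_left() == 1);
        assert(increments_then_reads() == 2);
    }
    ```

    Both fully sequenced orders yield 1, while the hand-written interleaving yields 2. The original single expression guarantees neither result, because its behavior is undefined.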

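    Contrast this with a version that moves each operand into its own function:

    int f(void) { return x++, y; }   /* x and y are the globals from the question */
    int g(void) { return y++, x; }
    int main(void) { return f() + g(); }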
    

    According to C99 (5.1.2.3/2), the calls to f and g themselves count as side effects, because each calls a function that modifies an object. There is also a sequence point just before each function is actually entered. This means the executions of f and g cannot interleave.

    Under the "evaluate things in parallel" model:

    f()  // arbitrarily start with f: sequence point; enter f
    g()  // at the same time, start calling g: sequence point
    

    Since the execution of f itself counts as a side effect, the sequence point in g() suspends execution until f has returned. Thus there is no undefined behavior, although which of the two calls runs first remains unspecified.
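
    The function-call version can be checked with a small sketch like the one below. The helpers sum_f_first, sum_g_first, sum_calls and check_call_orders are hypothetical names added for illustration. They force each of the two possible call orders in turn, on the assumption (argued above) that the two calls cannot interleave.

    ```c
    #include <assert.h>

    int x, y;

    int f(void) { return x++, y; }
    int g(void) { return y++, x; }

    /* f runs to completion, then g. */
    int sum_f_first(void)
    {
        x = 0; y = 0;
        int a = f();  /* x becomes 1, a is 0 */
        int b = g();  /* y becomes 1, b is 1 */
        return a + b;
    }

    /* g runs to completion, then f. */
    int sum_g_first(void)
    {
        x = 0; y = 0;
        int b = g();  /* y becomes 1, b is 0 */
        int a = f();  /* x becomes 1, a is 1 */
        return a + b;
    }

    /* The expression from the answer: call order is unspecified,
       but the two calls do not overlap. */
    int sum_calls(void)
    {
        x = 0; y = 0;
        return f() + g();
    }

    /* Self-check: every permitted order gives the same sum. */
    void check_call_orders(void)
    {
        assert(sum_f_first() == 1);
        assert(sum_g_first() == 1);
        assert(sum_calls() == 1);
    }
    ```

    Either call order produces 1, so f() + g() yields 1 here even though the order of the calls is unspecified. Calling check_call_orders() from main is enough to run the check.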