c gcc clang side-effects unspecified-behavior

Is this a C language undefined behaviour? Different results in clang and GCC


I'm getting different results for the same code with different compilers. Is this undefined behaviour?

#include <stdio.h>
int a;
int b=10;
int puan_ekle(int puan, int bonus){
    puan=puan+bonus;
    a=puan-5;
    bonus--;
    return bonus;
}
int main(){
    a=23;
    printf("Result1 %d \n", a);
    a=a+puan_ekle(a,b);
    printf("Result2 %d \n", a);
    a=a+puan_ekle(a,b);
    printf("Result3 %d \n", a);
}

Solution

  • The behavior is unspecified, not undefined.

    The C standard distinguishes these. C 2018 3.4.4 1 says:

    unspecified behavior

    behavior, that results from the use of an unspecified value, or other behavior upon which this document provides two or more possibilities and imposes no further requirements on which is chosen in any instance

    And 3.4.3 1 says:

    undefined behavior

    behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this document imposes no requirements

    In some situations, when an object is both used for its value and modified, a rule in the C standard makes the behavior undefined. 6.5 2 says:

    If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined. If there are multiple allowable orderings of the subexpressions of an expression, the behavior is undefined if such an unsequenced side effect occurs in any of the orderings.

    Let’s see how this applies to a=a+puan_ekle(a,b);. In this expression:

    1. a is modified by the a=.
    2. a is used in the a+.
    3. a is used in the arguments (a,b).
    4. Inside the function puan_ekle, a is modified with a=puan-5;.

    The modifications are side effects—they are something that happens separately from computing the value of the expression. If either of the modifications, 1 or 4, is unsequenced relative to any of the other items, the behavior is undefined.

    Regarding 1, 6.5.16 3 says:

    … The side effect of updating the stored value of the left operand is sequenced after the value computations of the left and right operands…

    So 1 is sequenced after 2 and 3. Since 4 is a side effect, not a value computation, we still have to consider the relationship of 1 and 4. To resolve this, we will consider sequence points. Per 5.1.2.3, “The presence of a sequence point between the evaluation of expressions A and B implies that every value computation and side effect associated with A is sequenced before every value computation and side effect associated with B.”

    Next we need to know what a full expression is and that there is a sequence point after each full expression. 6.8 4 says:

    A full expression is an expression that is not part of another expression, nor part of a declarator or abstract declarator… There is a sequence point between the evaluation of a full expression and the evaluation of the next full expression to be evaluated.

    This means that every statement inside puan_ekle is or contains a full expression: puan=puan+bonus is a full expression, a=puan-5 is a full expression, bonus-- is a full expression, and the bonus in return bonus is a full expression. So, after a=puan-5, there is a sequence point.

    For a=, the side effect of modifying a is sequenced after the value computations of the operands, and evaluating those operands includes calling the function, which includes its sequence points. So effect 4, modifying a in a=puan-5;, must be completed before execution continues to the next statement, and hence must be completed before effect 1. So 1 and 4 are sequenced.

    What is left is to consider effect 4 with respect to 2 and 3. 4 is sequenced after 3 because a function call is sequenced after evaluation of its arguments, per 6.5.2.2 10:

    There is a sequence point after the evaluations of the function designator and the actual arguments but before the actual call…

    Now all we have left is the sequencing of 2 relative to 4. The standard does not specify which comes first. The evaluations of the operands of + are unsequenced relative to each other, so, for a+puan_ekle(a,b), a C implementation may evaluate either a first or puan_ekle(a,b) first. However, whichever it evaluates first, there is a sequence point between 2 and 4:

    • If a is evaluated first, then, before the function call, there is a sequence point (per 6.5.2.2 10, quoted above).
    • If puan_ekle(a,b) is evaluated first, there is a sequence point after the full expression a=puan-5.

    Thus, 2 and 4 are indeterminately sequenced. (5.1.2.3 3: “… Evaluations A and B are indeterminately sequenced when A is sequenced either before or after B, but it is unspecified which…”) But they are not unsequenced, so there is no undefined behavior. The behavior is unspecified because there are two possibilities. The C implementation is required to implement one of those two possibilities, which is different from undefined behavior, in which there would be no requirements.