Can volatile variables be read multiple times between sequence points?


I'm writing my own C compiler to learn as many details as possible about C. I'm now trying to understand exactly how volatile objects work.

What confuses me is that every read access in the code must actually be performed (C11, 6.7.3p7):

An object that has volatile-qualified type may be modified in ways unknown to the implementation or have other unknown side effects. Therefore any expression referring to such an object shall be evaluated strictly according to the rules of the abstract machine, as described in 5.1.2.3. Furthermore, at every sequence point the value last stored in the object shall agree with that prescribed by the abstract machine, except as modified by the unknown factors mentioned previously.134) What constitutes an access to an object that has volatile-qualified type is implementation-defined.

Example: in a = volatile_var - volatile_var;, the volatile variable must be read twice, so the compiler can't optimise this to a = 0;

At the same time, the order of evaluation between sequence points is unspecified (C11, 6.5p3):

The grouping of operators and operands is indicated by the syntax. Except as specified later, side effects and value computations of subexpressions are unsequenced.

Example: in b = (c + d) - (e + f), the order in which the two additions are evaluated is unspecified, as they are unsequenced.
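
To make the unspecified ordering observable, here is a minimal sketch in which the operands are replaced by function calls that record when they run. The names read_c, read_d, read_e, read_f, note, call_log and compute_b are hypothetical and exist only for this illustration. Function calls are indeterminately sequenced rather than unsequenced, so this variant has no undefined behaviour; only the order recorded in call_log may differ between compilers.

```c
#include <stdio.h>

/* Hypothetical log of the order in which the helpers actually ran. */
char call_log[5];
int call_count;

/* Records one call and passes the value through. */
static int note(char name, int value)
{
    if (call_count < 4) {
        call_log[call_count++] = name;
        call_log[call_count] = '\0';
    }
    return value;
}

static int read_c(void) { return note('c', 1); }
static int read_d(void) { return note('d', 2); }
static int read_e(void) { return note('e', 3); }
static int read_f(void) { return note('f', 4); }

/* Mirrors b = (c + d) - (e + f). The result is fixed by the operators,
   but the order of the four calls is unspecified. */
int compute_b(void)
{
    call_count = 0;
    return (read_c() + read_d()) - (read_e() + read_f());
}

/* Prints the result and whatever order this particular compiler chose. */
void show_order(void)
{
    int b = compute_b();
    printf("b = %d, call order = %s\n", b, call_log);
}
```

Calling show_order() always reports b = -4, while the reported call order is whatever the implementation picked and should not be relied upon.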

But when unsequenced evaluations involve a side effect on the same object (as with an access to a volatile object, for instance), the behaviour is undefined (C11, 6.5p2):

If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined. If there are multiple allowable orderings of the subexpressions of an expression, the behavior is undefined if such an unsequenced side effect occurs in any of the orderings.

Does this mean that expressions like x = volatile_var - (volatile_var + volatile_var) are undefined? Should my compiler emit a warning when this occurs?

I've looked at what Clang and GCC do. Neither emits an error or a warning. The generated asm shows that the variables are NOT read in execution order, but left to right instead, as shown in the RISC-V asm below:

const int volatile thingy = 0;
int main()
{
    int new_thing = thingy - (thingy + thingy);
    return new_thing;
}

main:
        lui     a4,%hi(thingy)
        lw      a0,%lo(thingy)(a4)
        lw      a5,%lo(thingy)(a4)
        lw      a4,%lo(thingy)(a4)
        add     a5,a5,a4
        sub     a0,a0,a5
        ret

Edit: I am not asking "Why do compilers accept it?", I am asking "Is it undefined behaviour if we strictly follow the C11 standard?". The standard seems to say that it is undefined behaviour, but I need a more precise reading to interpret it correctly.


Solution

  • Read literally, the standard (ISO 9899:2018) makes this undefined behavior.

    C17 5.1.2.3/2 - definition of side effects:

    Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects

    C17 6.5/2 - sequencing of operands:

    If a side effect on a scalar object is unsequenced relative to either a different side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined. If there are multiple allowable orderings of the subexpressions of an expression, the behavior is undefined if such an unsequenced side effect occurs in any of the orderings.

    Thus, reading the standard literally, volatile_var - volatile_var is definitely undefined behavior. It is UB twice over, actually, since both of the quoted sentences apply.


    Please also note that this text changed quite a bit in C11. Previously C99 said, 6.5/2:

    Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored.

    That is, the behaviour was previously unspecified in C99 (unspecified order of evaluation) but was made undefined by the changes in C11.


    That being said, other than re-ordering the evaluation as it pleases, a compiler doesn't really have any reason to do wild and crazy things with this expression since there isn't much that can be optimized, given volatile.

    As a matter of quality of implementation, mainstream compilers seem to maintain the previous "merely unspecified" behavior from C99.
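
If the order of the volatile reads matters, or if the question of undefined behaviour should be avoided altogether, one possible approach is to give each read its own full expression, since the end of each initializer is a sequence point. The sketch below is an assumption about how one might rewrite the question's expression, not something taken from the standard or from the compilers' documentation. The function name sequenced_reads and the locals are hypothetical, and thingy is declared non-const with a non-zero value here purely so the result is visible.

```c
/* Hypothetical stand-in for the question's `thingy`; non-const and
   non-zero only so that the arithmetic is visible in the result. */
volatile int thingy = 7;

/* Each volatile read sits in its own initializer, so the three accesses
   are sequenced one after another in source order. */
int sequenced_reads(void)
{
    int first  = thingy;   /* read 1, sequence point at the end of the initializer */
    int second = thingy;   /* read 2 */
    int third  = thingy;   /* read 3 */

    /* Only plain locals are combined here; no volatile access remains. */
    return first - (second + third);
}
```

With this shape there are still exactly three reads of thingy, but none of them is unsequenced relative to another, so neither sentence of 6.5/2 comes into play.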