
ARM Compiler 5 does not fully respect the volatile qualifier


Consider the following code:

volatile int status;

status = process_package_header(&pack_header, PACK_INFO_CONST);

if ((((status) == (SUCCESS_CONST)) ? ((random_delay() && ((SUCCESS_CONST) == (status))) ? 0 : side_channel_sttack_detected()) : 1))
{
    ...
}

Which generates this machine code (produced with the toolchain's objdump):

  60:   f7ff fffe       bl      0 <process_package_header>
  64:   9000            str     r0, [sp, #0]     /* <- storing to memory as status is volatile */
  66:   42a0            cmp     r0, r4           /* <- where is the load before the compare? status is volatile; it could have changed between the store instruction (line above) and now */
  68:   d164            bne.n   134 <func+0x134>
  6a:   f7ff fffe       bl      0 <random_delay>

Since status is volatile, it should be read from memory when the if statement is reached. I would expect to see a load instruction before the comparison (cmp) with SUCCESS_CONST, regardless of the fact that status was just assigned the return value of process_package_header() and stored to memory: because status is volatile, it could have changed between the str instruction and the cmp instruction.

Please try to ignore the motivation for the if condition. Its purpose is to detect a physical attack on the CPU in which condition flags and registers are altered externally by physical equipment.

Toolchain: ARM DS-5_v5.27.0; compiler: ARMCompiler5.06u5 (armcc)

Target: ARM Cortex-M0+ CPU


Solution

  • The main rule governing volatile objects is this, from C11 6.7.3/7:

    any expression referring to such an object shall be evaluated strictly according to the rules of the abstract machine, as described in 5.1.2.3. Furthermore, at every sequence point the value last stored in the object shall agree with that prescribed by the abstract machine, except as modified by the unknown factors mentioned previously.

    And it goes on to say that

    What constitutes an access to an object that has volatile-qualified type is implementation-defined.

    That qualification applies to how other rules (e.g. in 5.1.2.3) are to be interpreted. Your compiler's Users' Guide discusses the details of volatile accesses, but there doesn't seem to be anything surprising there. Section 5.1.2.3 itself mainly talks about sequencing rules; the rules for evaluating expressions are elsewhere (but must still be followed as given with regard to accesses to your volatile object).

    Here are the relevant details of the behavior of the abstract machine:

    1. the assignment operation has a side effect of storing a value in the object identified by status. There is a sequence point at the end of that statement, so

      • the side effect is applied before any evaluations appearing in subsequent statements are performed, and
      • because status is volatile, the assignment expressed by that line is the last write to status performed by the program before the sequence point.
    2. the conditional expression in the if statement is evaluated next, with

      • the sub-expression (status) == (SUCCESS_CONST) being evaluated first, before any of the other sub-expressions.
      • Evaluation of status happens before evaluation of the == operation, and
      • takes the form of converting that identifier to the value stored in the object it identifies (lvalue conversion, per paragraph 6.3.2.1/2).
      • In order to do anything with the value stored in status at that time, that value must first be read.
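
The steps above can be made explicit by giving each volatile read its own statement. The following is a minimal sketch, not the asker's actual code: the value of SUCCESS_CONST, the stub functions, and the `tamper` and `attack_reports` hooks are all hypothetical stand-ins, added only so the example compiles on its own and can simulate an external change to status.

```c
#define SUCCESS_CONST 0x5A5A        /* hypothetical value */

static volatile int status;
static int tamper;                  /* hypothetical hook: simulate an external change */
static int attack_reports;          /* hypothetical counter for the demo */

/* Hypothetical stand-ins for the functions named in the question. */
static int process_package_header(void) { return SUCCESS_CONST; }

static int random_delay(void)
{
    if (tamper)
        status = ~SUCCESS_CONST;    /* status changes behind the caller's back */
    return 1;
}

static int side_channel_attack_detected(void)
{
    attack_reports++;
    return 1;
}

/* Same logic as the conditional expression in the question, with each
 * volatile access written as a separate statement.
 * Returns 0 when the header is accepted, non-zero otherwise. */
static int check_header(void)
{
    status = process_package_header();   /* volatile write */

    int first = status;                  /* volatile read #1 */
    if (first != SUCCESS_CONST)
        return 1;

    if (random_delay()) {
        int second = status;             /* volatile read #2 */
        if (second == SUCCESS_CONST)
            return 0;
    }
    return side_channel_attack_detected();
}
```

With `tamper` left at 0, a conforming implementation returns 0 from check_header(). With `tamper` set to 1, the second read must observe the changed value, so the attack path is taken. A compiler that reuses the register copy instead of reloading status would miss that change.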

    The standard does not require a volatile object to reside in addressable storage, so in principle, your volatile automatic variable could be assigned exclusively to a register. In that event, as long as machine instructions using that object either read its value directly from its register or make updates directly to its register, no separate loads or stores would be required to achieve proper volatile semantics. Your particular object does not appear to fall into this category, however, because the store instruction in your generated assembly seems to indicate that it is, indeed, associated with a location in memory.

    Moreover, if the program correctly implemented volatile semantics for an object assigned to a register, then that register would have to be r0. I'm not familiar with the specifics of this assembly language and the processor on which the code runs, but it certainly does not look like r0 is a viable locus for such storage.

    With that being the case I agree that status should have been read back from memory, and it should be read back from memory again if its second appearance in the conditional expression needs to be evaluated. This is the behavior of the abstract machine, which conforming implementations exhibit with respect to all volatile accesses. My analysis, then, is that your implementation is non-conforming in this regard, and I would be inclined to report that as a bug.

    As for a workaround, I think your best bet is to write the important bits in assembly: inline assembly if your implementation supports that, or a complete function implemented in assembly if necessary.
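
If assembly is not an option, a different technique that stays in C is to route every read through a small out-of-line helper that takes a pointer to volatile. This is only a sketch under assumptions: the helper name `read_volatile_int` is hypothetical, the GNU-style `noinline` attribute is assumed to be available on the toolchain, and whether this actually avoids the behavior seen with armcc 5.06 has not been verified here. Inspect the generated code before relying on it.

```c
/* Hypothetical helper: performs one read through a volatile-qualified lvalue.
 * Keeping it out of line is meant to stop the compiler from substituting a
 * value it still holds in a register at the call site. The attribute is a
 * GNU extension and an assumption about the toolchain. */
#if defined(__GNUC__) || defined(__CC_ARM)
__attribute__((noinline))
#endif
static int read_volatile_int(const volatile int *p)
{
    return *p;
}
```

The comparison in the question would then be written as `read_volatile_int(&status) == SUCCESS_CONST` at each place where status is tested.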