Tags: c, assembly, arm, flags

How can I make a compiler choose flag-updating ARM instructions?


I am trying to make use of the CPSR flags when my code does arithmetic operations rather than using a series of if statements to check for overflow, carry, etc. in order to have smaller, faster code. A simple example is this addition operation:

int16_t a = 0x5000;
int16_t b = 0x4000;
int16_t result = a+b;
uint32_t flags = getFlags();

The code will need to run on various platforms, so getFlags() is the only part of the code that will be allowed to contain architecture-specific assembly.

inline uint32_t getFlags() {
    uint32_t flags = 0;
    asm ("mrs %0, cpsr"
        : "=r" (flags)
        :
        : );
    return flags;
}

The problem is that the compiler doesn’t have any way of knowing that the addition operation in this example should be setting the flags, so it generates instructions similar to:

ldrsh r3, [r0]
ldrsh r4, [r1]
add r3, r3, r4
strh r3, [r2]
mrs r3, cpsr

In order for the CPSR to contain anything useful, I need the compiler to use adds instead of add (s suffix = update CPSR). Is there something I can change in my C code or possibly a compiler option that will cause it to choose flag-updating instructions? I can use either GCC or Clang.


Solution

  • You cannot dictate which instructions the compiler will use. Such an approach is futile and is incompatible with the crucial optimization functions that compilers perform.

    You can obtain portable overflow checking by using compiler builtins supported by both GCC and Clang. For example, __builtin_add_overflow(a, b, &c) stores a+b in c and returns true if overflow occurred. (And it is type-generic; a, b, and c may be any integer types. Whether overflow occurs depends only on the values of a and b and the type of c.)

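    You can expect such builtins to participate in optimization, including the use of flag-updating instructions where they are suitable. (The GCC documentation states that the compiler will attempt to use hardware instructions to implement these builtins where possible, such as a conditional jump on overflow after an addition.)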
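
As a minimal sketch of the builtin approach applied to the question's 16-bit addition: the helper names add_i16_checked and add_u16_carry below are hypothetical and not part of any library. The signed helper reports overflow, and the unsigned one reports what would be the carry out of a 16-bit add. Whether the compiler actually emits adds or another flag-setting instruction depends on the target and optimization level, so it is worth checking the generated assembly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: signed 16-bit add.
   Stores the wrapped sum in *out and returns true if the
   mathematically exact sum does not fit in int16_t. */
static bool add_i16_checked(int16_t a, int16_t b, int16_t *out) {
    return __builtin_add_overflow(a, b, out);
}

/* Hypothetical helper: unsigned 16-bit add.
   For an unsigned result type, "overflow" is the carry out
   of the top bit, so the return value acts as a carry flag. */
static bool add_u16_carry(uint16_t a, uint16_t b, uint16_t *out) {
    return __builtin_add_overflow(a, b, out);
}
```

With the question's values, add_i16_checked(0x5000, 0x4000, &result) returns true, because 0x9000 is outside the int16_t range. The calling code stays free of architecture-specific assembly, and the same source builds with both GCC and Clang.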