if statement in assembly output of C code


I have this simple piece of code in C:

#include <stdio.h>

void test() {}

int main()
{
    if (2 < 3) {

        int zz = 10;
    }
    return 0;
}

When I look at the assembly output of this code:

test():
  pushq %rbp
  movq %rsp, %rbp
  nop
  popq %rbp
  ret
main:
  pushq %rbp
  movq %rsp, %rbp
  movl $10, -4(%rbp) # zz = 10, stored in its stack slot
  movl $0, %eax
  popq %rbp
  ret

I got the assembly from here (default options). I can't see where the instruction for the conditional check is.


Solution

  • The interesting thing here is that gcc and clang optimize away the if() even at -O0, unlike some other compilers (ICC and MSVC).

    gcc -O0 doesn't mean no optimization; it means no extra optimization beyond what's needed to compile at all. gcc still has to transform the function logic through a couple of internal representations (GIMPLE and Register Transfer Language) before emitting asm. It doesn't have a special "dumb mode" where it slavishly transliterates every part of every C expression to asm.

    Even a super-simple one-pass compiler like TCC does minor optimizations within an expression (or even a statement), like realizing that an always-true condition doesn't require branching.

    gcc -O0 is the default, which you obviously used because the dead store to zz isn't optimized away.

    gcc -O0 aims to compile quickly, and to give consistent debugging results.

    • Every C variable exists in memory, whether it's ever used or not.

    • Nothing is kept in registers across C statements (except variables declared register; -O0 is the only time that keyword does anything). In other words, everything is spilled and reloaded between separate C statements, so you can modify any C variable with a debugger while single-stepping. See also Why does clang produce inefficient asm with -O0 (for this simple floating point sum)? (This is why benchmarking at -O0 is nonsense: writing the same code with fewer, larger expressions is faster only at -O0, not with real settings like -O3.)

      Another interesting consequence: constant-propagation doesn't happen across statements. See Why does integer division by -1 (negative one) result in FPE? for a case where gcc uses div for a variable set to a constant, vs. something simpler for a literal constant.

    • Every statement is compiled independently, so you can even jump to a different source line (within the same function) using GDB and get consistent results. (Unlike in optimized code where that would be likely to crash or give nonsense, and definitely not match the C abstract machine).
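    As a rough illustration of the spill/reload rule, the two hypothetical functions below (the names are invented here, not taken from the question) compute the same sum. At gcc -O0 one would expect sum_separate to store t to the stack and reload it for each statement, while sum_single can keep its intermediate value in a register within the one expression. With optimization enabled, both should end up as the same code.

```c
/* Hypothetical example: same result, different statement structure. */

/* Three separate statements: at -O0 each one is compiled on its own,
   so t is expected to be stored to and reloaded from the stack. */
int sum_separate(int a, int b, int c)
{
    int t = a;
    t += b;
    t += c;
    return t;
}

/* One expression: the intermediate a + b can stay in a register
   even at -O0, because no statement boundary is crossed. */
int sum_single(int a, int b, int c)
{
    return a + b + c;
}
```

    Comparing the asm for these two at -O0 and at -O3 is a quick way to see why -O0 timings say little about real builds.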

    Given all those requirements for gcc -O0 behaviour, if (2 < 3) can still be optimized to zero asm instructions. The behaviour doesn't depend on the value of any variable, and it's a single statement. There's no way it can ever be not-taken, so the simplest way to compile it is no instructions: fall-through into the { body } of the if.
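    A minimal sketch of that distinction, using two hypothetical functions that are not in the question: the first tests only literals, the second tests variables that happen to hold the same values. At gcc -O0 one would expect no compare for the first and a real cmp plus conditional jump for the second, since a debugger could change a or b before the if runs.

```c
/* Condition made only of literals: no variable can change the outcome,
   so a compiler may emit no compare or branch at all, even at -O0. */
int literal_condition(void)
{
    int zz = 0;
    if (2 < 3) {
        zz = 10;
    }
    return zz;
}

/* Same logic, but the operands live in variables. At gcc -O0 the
   values are expected to be loaded from the stack and compared at
   run time, because each statement is compiled independently. */
int variable_condition(void)
{
    int a = 2;
    int b = 3;
    int zz = 0;
    if (a < b) {
        zz = 10;
    }
    return zz;
}
```

    Both functions return 10; only the generated asm is expected to differ.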

    Note that gcc -O0's rules / restrictions go far beyond the C as-if rule that the machine-code for a function merely has to implement all externally-visible behaviour of the C source. gcc -O3 optimizes the whole function down to just

    main:                 # with optimization
        xor    eax, eax
        ret
    

    because it doesn't care about keeping asm for every C statement.
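    To make the as-if rule a bit more concrete, here is a small sketch with two hypothetical functions. In the first, the store to zz can never be observed, so an optimizing compiler is free to drop it. In the second, the store goes through a pointer the caller can read, so it has to survive optimization even though the always-true branch can still vanish.

```c
/* The store to zz is never observable, so an optimizing compiler
   may remove it together with the always-true branch. */
int dead_store(void)
{
    if (2 < 3) {
        int zz = 10;
        (void)zz;   /* silences unused-variable warnings */
    }
    return 0;
}

/* The store through out is visible to the caller, so it is part of
   the function's externally-visible behaviour and must be kept. */
int visible_store(int *out)
{
    if (2 < 3) {
        *out = 10;
    }
    return 0;
}
```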


    Other compilers:

    See all 4 of the major x86 compilers on Godbolt.

    clang is similar to gcc, but with a dead store of 0 to another spot on the stack, as well as the 10 for zz. clang -O0 is often closer to a transliteration of C into asm, for example it will use div for x / 2 instead of a shift, while gcc uses a multiplicative inverse for division by a constant even at -O0. But in this case, clang also decides that no instructions are sufficient for an always-true condition.
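    For readers who haven't seen the multiplicative-inverse trick, the sketch below writes it out by hand for unsigned 32-bit division by 10. The function names are made up for illustration, and the exact instruction sequence a given gcc version emits may differ; 0xCCCCCCCD with a shift of 35 is the usual constant pair for this divisor.

```c
#include <stdint.h>

/* Plain division, as written in the source. */
uint32_t div10_plain(uint32_t x)
{
    return x / 10;
}

/* Same result via multiply and shift: 0xCCCCCCCD is ceil(2^35 / 10),
   so (x * 0xCCCCCCCD) >> 35 equals x / 10 for every 32-bit x.
   The product is computed in 64 bits so it cannot overflow. */
uint32_t div10_inverse(uint32_t x)
{
    return (uint32_t)(((uint64_t)x * 0xCCCCCCCDu) >> 35);
}
```

    A multiply and a shift are much cheaper than a div instruction on most x86 CPUs, which is presumably why gcc bothers with this even at -O0.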

    ICC and MSVC both emit asm for the branch, but instead of the mov $2, %ecx / cmp $3, %ecx you might expect, they both actually do 0 != 1 for no apparent reason:

    # ICC18
        pushq     %rbp                                          #6.1
        movq      %rsp, %rbp                                    #6.1
        subq      $16, %rsp                                     #6.1
    
        movl      $0, %eax                                      #7.5
        cmpl      $1, %eax                                      #7.5
        je        ..B1.3        # Prob 100%                     #7.5
    
        movl      $10, -16(%rbp)                                #9.16
    ..B1.3:                         # Preds ..B1.2 ..B1.1
        movl      $0, %eax                                      #11.12
        leave                                                   #11.12
        ret                                                     #11.12
    

    MSVC uses the xor-zeroing peephole optimization even without optimization enabled.

    It's slightly interesting to look at which local / peephole optimizations compilers do even at -O0, but it doesn't tell you anything fundamental about C language rules or your code. It only tells you about compiler internals, and about the tradeoff the compiler devs chose between spending time looking for simple optimizations and compiling even faster in no-optimization mode.

    The asm is never intended to faithfully represent the C source in a way that would let a decompiler reconstruct it; it only has to implement equivalent logic.