Search code examples
optimizationgccstack-frame

Is the gcc insane optimisation level (-O3) not insane enough?


As part of answering another question, I wanted to show that the insane level of optimisation of gcc (-O3) would basically strip out any variables that weren't used in main. The code was:

#include <stdio.h>
int main (void) {
   char bing[71];
   int x = 7;
   bing[0] = 11;
   return 0;
}

and the gcc -O3 output was:

    .file "qq.c"
    .text
    .p2align 4,,15
.globl main
    .type main, @function
main:
    pushl %ebp
    xorl %eax, %eax
    movl %esp, %ebp
    popl %ebp
    ret
    .size main, .-main
    .ident "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
    .section .note.GNU-stack,"",@progbits

Now I can see it's removed the local variables but there's still quite a bit of wastage in there. It seems to me that the entire:

    pushl %ebp
    xorl %eax, %eax
    movl %esp, %ebp
    popl %ebp
    ret

section could be replaced with the simpler:

    xorl %eax, %eax
    ret

Does anyone have any idea why gcc does not perform this optimisation? I know that would save very little for main itself but, if this were done with normal functions as well, the effect of unnecessarily adjusting the stack pointer in a massive loop would be considerable.

The command used to generate the assembly was:

gcc -O3 -std=c99 -S qq.c

Solution

  • You can enable that particular optimization with the -fomit-frame-pointer compiler flag. Doing so makes debugging impossible on some machines and substantially more difficult on everything else, which is why it's usually disabled.

    Although your GCC documentation may say that -fomit-frame-pointer is enabled at various optimization levels, you'll likely find that that's not the case—you'll almost certainly have to explicitly enable it yourself.