Search code examples
coptimizationgccllvm-clang

Why does the compiler generate code that writes the same stuff in the same memory location over and over again?


I compiled the following code using clang and gcc and called both with -O3:

#include <stdio.h>
#include <stdlib.h>

static void a(int n) {
  if (n == 0) return;

  printf("descending; a=%i\n", n);
  a(n-1);
}

int main() {
  a(5);
  return 0;
}

Here is the main function gcc generated (without NOPs at the end):

08048310 <main>:
 8048310:   55                      push   %ebp
 8048311:   89 e5                   mov    %esp,%ebp
 8048313:   83 e4 f0                and    $0xfffffff0,%esp
 8048316:   83 ec 10                sub    $0x10,%esp
 8048319:   c7 44 24 04 05 00 00    movl   $0x5,0x4(%esp)
 8048320:   00 
 8048321:   c7 04 24 14 85 04 08    movl   $0x8048514,(%esp)
 8048328:   e8 c7 ff ff ff          call   80482f4 <printf@plt>
 804832d:   c7 44 24 04 04 00 00    movl   $0x4,0x4(%esp)
 8048334:   00 
 8048335:   c7 04 24 14 85 04 08    movl   $0x8048514,(%esp)
 804833c:   e8 b3 ff ff ff          call   80482f4 <printf@plt>
 8048341:   c7 44 24 04 03 00 00    movl   $0x3,0x4(%esp)
 8048348:   00 
 8048349:   c7 04 24 14 85 04 08    movl   $0x8048514,(%esp)
 8048350:   e8 9f ff ff ff          call   80482f4 <printf@plt>
 8048355:   c7 44 24 04 02 00 00    movl   $0x2,0x4(%esp)
 804835c:   00 
 804835d:   c7 04 24 14 85 04 08    movl   $0x8048514,(%esp)
 8048364:   e8 8b ff ff ff          call   80482f4 <printf@plt>
 8048369:   c7 44 24 04 01 00 00    movl   $0x1,0x4(%esp)
 8048370:   00 
 8048371:   c7 04 24 14 85 04 08    movl   $0x8048514,(%esp)
 8048378:   e8 77 ff ff ff          call   80482f4 <printf@plt>
 804837d:   31 c0                   xor    %eax,%eax
 804837f:   c9                      leave  
 8048380:   c3                      ret    

And here's the one from clang:

080483d0 <main>:
 80483d0:   55                      push   %ebp
 80483d1:   89 e5                   mov    %esp,%ebp
 80483d3:   56                      push   %esi
 80483d4:   83 ec 0c                sub    $0xc,%esp
 80483d7:   be 05 00 00 00          mov    $0x5,%esi
 80483dc:   8d 74 26 00             lea    0x0(%esi,%eiz,1),%esi
 80483e0:   89 74 24 04             mov    %esi,0x4(%esp)
 80483e4:   c7 04 24 e0 84 04 08    movl   $0x80484e0,(%esp)
 80483eb:   e8 04 ff ff ff          call   80482f4 <printf@plt>
 80483f0:   4e                      dec    %esi
 80483f1:   75 ed                   jne    80483e0 <main+0x10>
 80483f3:   31 c0                   xor    %eax,%eax
 80483f5:   83 c4 0c                add    $0xc,%esp
 80483f8:   5e                      pop    %esi
 80483f9:   5d                      pop    %ebp
 80483fa:   c3                      ret    

My question is: Is there a good reason why they both generate code that writes the address of the static string on the stack over and over again? For example, why doesn't the code clang generates look like this instead?

080483d0 <main>:
 80483d0:   55                      push   %ebp
 80483d1:   89 e5                   mov    %esp,%ebp
 80483d3:   56                      push   %esi
 80483d4:   83 ec 0c                sub    $0xc,%esp
 80483d7:   be 05 00 00 00          mov    $0x5,%esi
 80483dc:   8d 74 26 00             lea    0x0(%esi,%eiz,1),%esi
 80483e0:   c7 04 24 e0 84 04 08    movl   $0x80484e0,(%esp)
 80483e7:   89 74 24 04             mov    %esi,0x4(%esp)
 80483eb:   e8 04 ff ff ff          call   80482f4 <printf@plt>
 80483f0:   4e                      dec    %esi
 80483f1:   xx xx                   jne    80483e7 <main+0x17>
 80483f3:   31 c0                   xor    %eax,%eax
 80483f5:   83 c4 0c                add    $0xc,%esp
 80483f8:   5e                      pop    %esi
 80483f9:   5d                      pop    %ebp
 80483fa:   c3                      ret    

Solution

  • In the Mac OS X Application Binary Interface, the parameters passed on the stack may be modified by the called routine. So, where the calling routine puts the address of the format string on the stack, the called routine is allowed to write to that place on the stack. (It is not permitted generally to write to higher [earlier] locations on the stack.) Therefore, the calling routine cannot know that, after the called routine returns, the parameters the calling routine wrote to the stack are unchanged.

    This is a convenience for routines that might modify their arguments while using them to perform calculations. E.g., you could have code such as:

    int foo(int x)
    {
        x *= x;
        return x+3;
    }
    

    Obviously, for code this simple, the compiler would not actually need to store the product in x in order to finish computing the return value. However, with more complicated routines, you can see where the compiler might decide to store a value to x.

    As an ABI design issue, you could wonder whether it might be better to prohibit the called routine from using this space, so that the caller could rely on the values not changing. However, using a parameter repeatedly this way is not hugely common, and the cost of writing a new copy is tiny.

    Also note that only the address of the format string, which was written to the stack, may be changed. C semantics require that the contents of the format string remain unchanged by the called routine, since it is const char.