Calling C function which takes no parameters with parameters


I have a somewhat weird question about what is probably undefined behavior, involving the C calling convention and 64-bit versus 32-bit compilation. First, here is my code:

int f() { return 0; }

int main()
{
    int x = 42;
    return f(x);
}

As you can see, I am calling f with an argument although f takes no parameters. My first question is whether this argument is really passed to f when it is called.

The mysterious lines

After a little objdump, I obtained curious results. When passing x as an argument to f:

00000000004004b6 <f>:
  4004b6:   55                      push   %rbp
  4004b7:   48 89 e5                mov    %rsp,%rbp
  4004ba:   b8 00 00 00 00          mov    $0x0,%eax
  4004bf:   5d                      pop    %rbp
  4004c0:   c3                      retq   

00000000004004c1 <main>:
  4004c1:   55                      push   %rbp
  4004c2:   48 89 e5                mov    %rsp,%rbp
  4004c5:   48 83 ec 10             sub    $0x10,%rsp
  4004c9:   c7 45 fc 2a 00 00 00    movl   $0x2a,-0x4(%rbp)
  4004d0:   8b 45 fc                mov    -0x4(%rbp),%eax
  4004d3:   89 c7                   mov    %eax,%edi
  4004d5:   b8 00 00 00 00          mov    $0x0,%eax
  4004da:   e8 d7 ff ff ff          callq  4004b6 <f>
  4004df:   c9                      leaveq 
  4004e0:   c3                      retq   
  4004e1:   66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
  4004e8:   00 00 00 
  4004eb:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)

Without passing x as an argument:

00000000004004b6 <f>:
  4004b6:   55                      push   %rbp
  4004b7:   48 89 e5                mov    %rsp,%rbp
  4004ba:   b8 00 00 00 00          mov    $0x0,%eax
  4004bf:   5d                      pop    %rbp
  4004c0:   c3                      retq   

00000000004004c1 <main>:
  4004c1:   55                      push   %rbp
  4004c2:   48 89 e5                mov    %rsp,%rbp
  4004c5:   48 83 ec 10             sub    $0x10,%rsp
  4004c9:   c7 45 fc 2a 00 00 00    movl   $0x2a,-0x4(%rbp)
  4004d0:   b8 00 00 00 00          mov    $0x0,%eax
  4004d5:   e8 dc ff ff ff          callq  4004b6 <f>
  4004da:   c9                      leaveq 
  4004db:   c3                      retq   
  4004dc:   0f 1f 40 00             nopl   0x0(%rax)

So as we can see:

  4004d0:   8b 45 fc                mov    -0x4(%rbp),%eax
  4004d3:   89 c7                   mov    %eax,%edi

appear only when I call f with x, but because I am not very good at assembly, I don't really understand these lines.

The 64/32-bit paradox

I also tried something else and started printing the stack of my program.

Stack with x given to f (compiled in 64bits):

Address of x: ffcf115c
  ffcf1128:          0          0
  ffcf1130:   -3206820          0
  ffcf1138:   -3206808  134513826
  ffcf1140:         42   -3206820
  ffcf1148: -145495616  134513915
  ffcf1150:          1   -3206636
  ffcf1158:   -3206628         42
  ffcf1160: -143903780   -3206784

Stack with x not given to f (compiled in 64bits):

Address of x: 3c19183c
  3c191818:          0          0
  3c191820: 1008277568      32766
  3c191828:    4195766          0
  3c191830: 1008277792      32766
  3c191838:          0         42
  3c191840:    4195776          0

And for some reason, in 32 bits x seems to be pushed onto the stack.

Stack with x given to f (compiled in 32bits):

Address of x: ffdc8eac
  ffdc8e78:          0          0
  ffdc8e80:   -2322772          0
  ffdc8e88:   -2322760  134513826
  ffdc8e90:         42   -2322772
  ffdc8e98: -145086016  134513915
  ffdc8ea0:          1   -2322588
  ffdc8ea8:   -2322580         42
  ffdc8eb0: -143494180   -2322736

Why the hell does x appear in 32 bits but not in 64?

Code for printing: http://paste.awesom.eu/yayg/QYw6&ln

Why am I asking such stupid questions?

  • First, because I didn't find any standard that answers my question.
  • Secondly, think about calling a variadic function in C without passing the number of arguments.
  • Last but not least, I think undefined behavior is fun.

Thank you for taking the time to read this far, and for helping me understand something or making me realize that my questions are pointless.


Solution

  • The answer is that, as you suspect, what you are doing is undefined behavior (in the case where the superfluous argument is passed).

    The actual behavior in many implementations is harmless, however. An argument is prepared on the stack, and is ignored by the called function. The called function is not responsible for removing arguments from the stack, so there is no harm (such as an unbalanced stack pointer).
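
    The call compiles in the first place because, before C23, empty parentheses in `int f()` give the compiler no prototype to check calls against. A minimal sketch of the difference, assuming a pre-C23 compiler; `g` and `call_both` are hypothetical names introduced here for illustration:

    ```c
    /* Non-prototype definition, as in the question. Before C23 the empty
       parentheses give the compiler no parameter list to check calls against. */
    int f() { return 0; }

    /* Prototype form: g (a hypothetical name) explicitly takes no parameters. */
    int g(void) { return 0; }

    /* Hypothetical caller showing which calls are checked. */
    int call_both(void)
    {
        int x = 42;
        (void)x; /* x is only used in the commented-out calls below */

        /* f(x);  accepted by a pre-C23 compiler, but undefined behavior */
        /* g(x);  constraint violation: the compiler must issue a diagnostic */

        return f() + g();
    }
    ```

    Declaring the function as `int f(void)` is therefore the usual way to have the mistake caught at compile time instead of relying on the calling convention to tolerate it.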

    This harmless behavior was what enabled C hackers to develop, once upon a time, a variable argument list facility that used to be under #include <varargs.h> in ancient versions of the Unix C library.

    This evolved into the ANSI C <stdarg.h>.

    The idea was: pass extra arguments into a function, and then march through the stack dynamically to retrieve them.

    That won't work today. For instance, as you can see, the parameter is not in fact put on the stack but loaded into the RDI register (the disassembly shows EDI, its lower 32 bits). This is the convention GCC follows on x86-64 Linux, as laid down by the System V AMD64 ABI. If you march through the stack, you won't find the first several parameters. On IA-32, by contrast, GCC passes parameters on the stack, though you can get register-based behavior with the "fastcall" convention. That is why x shows up in your 32-bit stack dump.

    The va_arg macro from <stdarg.h> will correctly take into account the mixed register/stack parameter passing convention. (How it does so depends on the ABI. On x86-64 under the System V convention, a variadic function saves its register arguments to memory on entry, and va_arg walks that save area before moving on to the stack. Some other ABIs instead pass the trailing arguments of a variadic call in memory, so that va_arg can just march through it.)
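
    For comparison, here is a minimal sketch of the supported way to read extra arguments through <stdarg.h>. The function name `sum_ints` and its leading count parameter are assumptions made for illustration; they do not come from the question:

    ```c
    #include <assert.h>
    #include <stdarg.h>

    /* Sums `count` int arguments. The name sum_ints and the leading count
       parameter are illustrative assumptions, not part of the question. */
    int sum_ints(int count, ...)
    {
        va_list ap;
        int total = 0;

        assert(count >= 0);
        va_start(ap, count);          /* start after the last named parameter */
        for (int i = 0; i < count; i++)
            total += va_arg(ap, int); /* fetch the next argument as an int */
        va_end(ap);

        return total;
    }
    ```

    A call such as `sum_ints(3, 1, 2, 3)` lets the compiler and va_arg agree on where each argument lives, whether that is a register or the stack. The callee still has no way to discover how many arguments were passed, which is why the count (or a sentinel value) has to be supplied by the caller.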

    P.S. Your machine code might be easier to follow if you enabled some optimization. For instance, the sequence

      4004c9:   c7 45 fc 2a 00 00 00    movl   $0x2a,-0x4(%rbp)
      4004d0:   8b 45 fc                mov    -0x4(%rbp),%eax
      4004d3:   89 c7                   mov    %eax,%edi
      4004d5:   b8 00 00 00 00          mov    $0x0,%eax
    

    is fairly hard to read because of what look like wasteful data moves.