c, gcc, compilation, undefined-behavior

Given the state of the stack and registers, can we predict the outcome of printf's undefined behavior?


Here is some simple C code for a class quiz:

#include <stdio.h>

int main() {
  float a = 2.3;
  printf("%d\n", a);
  return 0;
}

Compiled and run on:

Apple LLVM version 6.1.0 (clang-602.0.53) (based on LLVM 3.6.0svn)
Target: x86_64-apple-darwin14.5.0

The behavior of this code is undefined. I am trying to predict the output by inspecting the memory near a with the debugger (the x command in gdb). For example, when the address of a is 0x7fff5fbffb98, the memory near &a looks like this:

0x7fff5fbffb98: 1075000115
0x7fff5fbffb9c: 0
0x7fff5fbffba0: 1606417336
0x7fff5fbffba4: 32767
0x7fff5fbffba8: -1754266167
0x7fff5fbffbac: 32767
0x7fff5fbffbb0: -1754266167
0x7fff5fbffbb4: 32767

Then the output of printf is 1606417352. I know that using an incorrect conversion specifier is undefined behavior. Out of curiosity, I expected the output to be related to some memory on the running stack or to some register, but I have not figured out how to correlate them.

So which address or register is used to set the output of this printf? In other words, given the state of the running stack, and all values from all registers, can we predict (and if so how) the output of this undefined behavior?


Solution

  • On AMD64 with the SysV calling convention (used by nearly every system but Windows), the first few arguments to a function are passed in registers. That's why you don't see them on the stack: they aren't passed on the stack.

    Specifically, the first six integer or pointer arguments are passed in rdi, rsi, rdx, rcx, r8, and r9, whereas the first eight floating point arguments are passed in xmm0 through xmm7. The format string is the first pointer argument and goes in rdi, so printf expects the int for %d in rsi. Since a is promoted to double and passed in xmm0 but printf attempts to read a number from rsi, you won't see any correlation between the number you supplied and what is printed out. What gets printed is whatever happens to be in the low 32 bits of rsi at the time of the call.


    For future readers: Please note that what OP attempts to do is undefined behavior. ISO 9899:2011 specifies that an int must be passed for %d, but OP passes a double (a float after default argument promotions). To print it, OP should use %f instead. Using the wrong conversion specifier is undefined behavior. Please do not assume that the observations OP makes hold on your system or anywhere else, and don't write this kind of code.