Tags: c, linux, 64-bit, printf, calling-convention

printf: how to explain corrupted result


#include <stdio.h>

int main(void)
{
        double resd = 0.000116;
        long long resi = 0;

        printf("%lld %f %lld %f\n", resd, resd, resi, resi);
        return 0;
}

gives (Linux, gcc, x64)

0 0.000116 0 0.000116
             ^^^^^^^^ odd, since the memory for resi is zeroed

Actually, compiled with g++ it gives random results instead of the second 0.

I understand that I passed mismatched conversion specifiers to printf and that this is undefined behavior, but I wonder why this specific corruption occurs, since long long and double have the same size.


Solution

  • I get the same results as you do on my machine (Mac OS X, which uses the same System V AMD64 ABI as Linux). Floating-point arguments are passed in XMM registers and integer arguments in general-purpose registers. When printf fetches them with va_arg, it reads from the XMM registers when it sees %f and from the integer registers when it sees %lld. Here's the disassembly of your program as compiled (-O0) on my machine:

    _main:
      pushq   %rbp
      movq    %rsp,%rbp
      subq    $0x20,%rsp
      movq    $0x3f1e68a0d349be90,%rax   ; bit pattern of 0.000116
      movq    %rax,0xf8(%rbp)            ; store resd
      movq    $0x00000000,0xf0(%rbp)     ; store resi = 0
      movq    0xf0(%rbp),%rdx            ; resi in %rdx
      movq    0xf0(%rbp),%rsi            ; resi in %rsi
      movsd   0xf8(%rbp),%xmm0
      movq    0xf8(%rbp),%rax
      movapd  %xmm0,%xmm1                ; resd in %xmm1
      movq    %rax,0xe8(%rbp)
      movsd   0xe8(%rbp),%xmm0           ; resd in %xmm0
      lea     0x0000001d(%rip),%rdi      ; address of the format string
      movl    $0x00000002,%eax           ; two vector registers used
      callq   0x100000f22    ; symbol stub for: _printf
      movl    $0x00000000,%eax
      leave
      ret
    

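    There you can see what's going on: the format string is passed in %rdi, and your four arguments are passed, in order, in %xmm0, %xmm1, %rsi and %rdx. The two doubles go to vector registers and the two long longs to integer registers, and each register class is filled independently of the other. printf does not pop anything; it reads the arguments in the order dictated by the format string, taking the next integer register for each %lld and the next XMM register for each %f. That means it reads %rsi, %xmm0, %rdx, %xmm1, giving the results you see. The 2 in %eax (strictly, %al) tells a variadic callee how many vector registers were used for arguments.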

    Edit:

    Here's an optimized version - in this case the shorter code might be easier to understand. The explanation is the same as above, but with a little less boilerplate noise. The floating point value is loaded by the movsd on line 4.

     1 _main:
     2    pushq   %rbp
     3    movq    %rsp,%rbp
     4    movsd   0x00000038(%rip),%xmm0
     5    xorl    %edx,%edx
     6    xorl    %esi,%esi
     7    movaps  %xmm0,%xmm1
     8    leaq    0x00000018(%rip),%rdi
     9    movb    $0x02,%al
    10    callq   0x100000f18   ; symbol stub for: _printf
    11    xorl    %eax,%eax
    12    leave
    13    ret