Tags: c, assembly, x86-64, calling-convention

Why does this code print different values compiled by clang and gcc?


asm.s:

.intel_syntax noprefix
.global Foo
Foo:
    mov ax, 146
    ret

main.c:

#include <stdio.h>

extern int Foo(void);

int main(int argc, char** args){
    printf("Asm returned %d\n", Foo());
    return 0;
}

Now I compile and link:

(compiler name) -c asm.s -o asm.o
(compiler name) asm.o main.c -o main
./main

I'm using Windows and the x64 Windows binary of LLVM.

GCC prints 146 (as expected), but the code produced by clang prints a random value. Why? There are apparently some issues with debugging clang binaries with gdb on Windows, so I can't provide any gdb logs. The GCC binary does what I expect, but the clang binary does not.


Solution

  • You want mov eax, 146 so that the write matches a return type that is 32 bits or wider.

    See How do AX, AH, AL map onto EAX? - writing EAX zero-extends into RAX, but 8 and 16-bit partial registers keep the legacy 386 behaviour of just merging, unfortunately.

    You told the compiler it returns an int (4 bytes in x86 and x86-64 calling conventions), but you only modified the low 16 bits of EAX/RAX. That leaves whatever garbage was already there in the high 16 bits of the 32-bit int return value, which the caller reads from EAX.

    sizeof(int) == 4 in all 32 and 64-bit calling conventions for x86 so it returns in EAX.
    16-bit AX is a short or unsigned short in C. (Or int16_t / uint16_t)

    If you declared the return type as short, the caller would look only at AX and ignore garbage in the upper bits, as the calling convention requires for narrow return values. (Args narrower than 32 bits are different: as an unwritten extension to at least x86-64 System V, and maybe also Windows x64, callers extend them to 32 bits, and clang depends on that.)

    See also How to remove "noise" from GCC/clang assembly output? for how to look at compiler-generated code to see correct asm for functions like this.


    The low 16 bits of your return value are 146. You can see this if you look at it in hex, e.g. xxxx0092.

    GCC probably happens to call it with the high half of EAX already zero, and clang doesn't. With different code in the caller, GCC might also have left a large value in RAX before the call. It's just luck, depending on the C caller and the optimization options you happened to test with.

    The C equivalent is int retval; memcpy(&retval, &tmp, 2), except in registers: the copy does not replace the uninitialized garbage in the high half of retval.