assembly x86 printf nasm calling-convention

Why does `printf` with `%hu` take 4 bytes from the stack instead of 2?


I am studying assembly basics and I use printf quite often (which I thought I knew well enough from my C/C++ experience). I came across a weird thing with 2-byte (16-bit) values:

In 32-bit mode, when calling printf("%hu", (unsigned short)123), I expected it to take 6 bytes from the stack (4 for the address of the format string and 2 for the value). In reality, it takes 4 bytes for the address and 4 for the value.

Why is that? Is %hu equal to just %u? Or is it just stripping high 2 bytes of the value?


Some code (C compiled to assembly with -O0): https://godbolt.org/z/x167zdon1

This works:

        mov     eax, 123
        push    eax
        push    format ; "%hu"
        call    _printf
        add     esp, 4 + 4    ; works

But I thought this should:

        mov     ax, 123
        push    eax
        push    format ; "%hu"
        call    _printf
        add     esp, 4 + 2    ; doesn't work: prints gibberish, as printf takes 2 more bytes to display the value

Solution

  • "doesn't work ..."

    Sure: you executed push eax, which pushes 4 bytes onto the stack. For this reason, you have to remove those 4 bytes from the stack again.

    "I came across a weird thing with 2-byte (16-bit) values."

    In x86-32 code, the stack should typically be kept aligned to 4 bytes. This means that the value of the esp register should always be a multiple of 4.

    For this reason, most calling conventions require pushing a full 32-bit value even if the function argument is only a 16-bit or even an 8-bit value.

    "Is %hu equal to just %u? Or is it just stripping high 2 bytes of the value?"

    It depends on the calling convention and the library implementation:

    When using a calling convention where the C compiler has to push a 32-bit number whose upper 16 bits are 0 if the argument is a 16-bit value, the library (the printf function) may be implemented in a way that %hu is identical to %u.

    When a calling convention is used where the upper 16 bits of such a number may have any value, the printf function must strip the upper 2 bytes, of course.
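The difference between %hu and %u can be observed from C. The following is a minimal sketch, not taken from the original answer: the helper names format_with_hu and format_with_u are hypothetical, and it assumes that unsigned short is 16 bits wide and int is 32 bits wide. The C standard says that for %hu the promoted argument is converted to unsigned short before printing, so a conforming printf should discard the upper 16 bits of the 32-bit slot.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats a 32-bit value with "%hu".
   The int argument mimics a 32-bit stack slot whose upper
   16 bits may contain anything. */
int format_with_hu(char *buf, size_t size, int promoted)
{
    /* printf converts the value to unsigned short before printing */
    return snprintf(buf, size, "%hu", promoted);
}

/* Hypothetical helper: formats the same bits with "%u" for comparison. */
int format_with_u(char *buf, size_t size, unsigned int value)
{
    /* all 32 bits are printed */
    return snprintf(buf, size, "%u", value);
}
```

Under these assumptions, formatting 0x1007B (123 with garbage in bit 16) should give "123" with format_with_hu and "65659" with format_with_u, which suggests that the library strips the upper 2 bytes rather than treating %hu as %u.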

    Edit

    Actually, you have two effects in the printf example:

    1. As I have already written, an x86-32 compiler will pass 32-bit values to a function if a char or short argument is expected.
    2. In functions with a variable number of arguments (such as printf), the "additional" arguments of the type char or short must be promoted to int by the compiler (the default argument promotions in C).

    Let's assume we work on a 16-bit CPU but the data type int is 32 bits long. And we have some function void test(short x, ...);.

    If we call test(a, a); in this situation, the compiler will pass the first argument as a 16-bit value (because the parameter has the data type short) and the second argument as a 32-bit value (because it is an additional argument).

    The same is the case for the additional arguments of printf() - assuming that int is a 32-bit data type.
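The promotion of additional arguments is visible on the callee's side as well. Below is a small sketch with a hypothetical variadic function, sum_shorts, which is not part of the original answer. The caller passes short values, but the function has to read them with va_arg(ap, int), because that is the type they have after promotion.

```c
#include <stdarg.h>

/* Hypothetical example: sums `count` additional arguments
   that the caller passes as short values. */
long sum_shorts(int count, ...)
{
    va_list ap;
    long total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++) {
        /* A short is promoted to int at the call site, so it must be
           read as int; va_arg(ap, short) would be undefined behavior. */
        total += va_arg(ap, int);
    }
    va_end(ap);

    return total;
}
```

A call such as sum_shorts(2, a, b) with short a and b therefore passes two int-sized values, just as printf receives an int-sized value for a %hu argument.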