gcc inline-assembly compiler-bug

Why is gcc not writing to memory correctly when I use __asm__?


I have a piece of code that calls a BIOS text output function, INT 0x10, AH=0x13:

static void __stdcall print_raw(uint8_t row, uint8_t col, uint8_t attr,
                               uint16_t length, char const *text)
{
    __asm__ __volatile__ (
        "xchgw %%bp,%%si\n\t"
        "int $0x10\n\t"
        "xchgw %%bp,%%si\n\t"
        :
        : "d" (row | (col << 8)), "c" (length), "b" (attr), "a" (0x13 << 8), "S" (text)
    );
}

I have another function which can print a number on the screen:

void print_int(uint32_t n)
{
    char buf[12];
    char *p;
    p = buf + 12;
    do {
        *(--p) = '0' + (n % 10);
        n /= 10;
    } while (p > buf && n != 0);

    print_raw(1, 0, 0x0F, (buf + 12) - p, p);
}

I spent significant time trying to figure out why nothing was coming up on the screen. I dug into it and looked at the code generated for print_int:

    .globl  print_int
    .type   print_int, @function
print_int:
.LFB7:
    .cfi_startproc
    pushl   %ebp
    .cfi_def_cfa_offset 8
    .cfi_offset 5, -8
    movl    $10, %ecx
    movl    %esp, %ebp
    .cfi_def_cfa_register 5
    pushl   %esi
    pushl   %ebx
    .cfi_offset 6, -12
    .cfi_offset 3, -16
    leal    -8(%ebp), %esi
    leal    -20(%ebp), %ebx
    subl    $16, %esp
    movl    8(%ebp), %eax
.L4:
    xorl    %edx, %edx
    decl    %esi
    divl    %ecx
    testl   %eax, %eax
    je  .L6
    cmpl    %ebx, %esi
    ja  .L4
.L6:
    leal    -8(%ebp), %ecx
    movl    $4864, %eax
    movb    $15, %bl
    movl    $1, %edx
    subl    %esi, %ecx
#APP
# 201 "bootsect.c" 1
    xchgw %bp,%si
    int $0x10
    xchgw %bp,%si

# 0 "" 2
#NO_APP
    addl    $16, %esp
    popl    %ebx
    .cfi_restore 3
    popl    %esi
    .cfi_restore 6
    popl    %ebp
    .cfi_restore 5
    .cfi_def_cfa 4, 4
    ret

If you look closely at the loop at .L4, you'll see that it never stores anything into buf. The store instruction has been omitted: the loop just divides n down and never writes any characters into the buffer.


Solution

  • The optimizer can produce this sort of unexpected code when you use __asm__ statements, so you need to be very careful about your constraints. In this case, the compiler didn't "see" that the asm reads memory through the pointer in esi, passed via the "S" (text) input constraint. An input constraint only covers the pointer value itself, not the memory it points to, so the stores to buf looked dead and were removed.

    The solution is to add a "memory" clobber to the clobber section of the __asm__ statement:

    __asm__ __volatile__ (
        "xchgw %%bp,%%si\n\t"
        "int $0x10\n\t"
        "xchgw %%bp,%%si\n\t"
        :
        : "d" (row | (col << 8)), "c" (length), "b" (attr), "a" (0x13 << 8), "S" (text)
        : "memory"
    );
    

    This tells the compiler that the asm may read and write arbitrary memory, so it must make sure memory is up to date before the assembly statement executes and must not rely on any memory values it has cached in registers. That is what prevents the compiler from eliding the stores to buf in my code.
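
The same effect can be reproduced without the BIOS. The following is a minimal sketch, assuming GCC on x86 or x86-64 with the default AT&T syntax. The names read_first_byte, read_first_byte_m and leading_digit are hypothetical and not part of the original code: the asm simply loads one byte through the pointer, standing in for INT 0x10 reading the string. The second variant shows a narrower alternative described in the GCC manual, an "m" input operand covering the pointed-to array, which declares the dependency without a blanket "memory" clobber.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for print_raw: instead of calling the BIOS,
 * the asm reads the byte that `text` points to. */
static uint8_t read_first_byte(char const *text)
{
    uint8_t value;
    __asm__ __volatile__ (
        "movb (%1), %0"
        : "=q" (value)   /* any byte-addressable register */
        : "r" (text)     /* covers only the pointer value itself */
        : "memory"       /* the asm also reads what it points to */
    );
    return value;
}

/* Narrower variant: the extra "m" operand tells GCC that the asm reads
 * the array behind `text`, so no "memory" clobber is needed. The operand
 * is not referenced in the template; it only records the dependency. */
static uint8_t read_first_byte_m(char const *text)
{
    uint8_t value;
    __asm__ __volatile__ (
        "movb (%1), %0"
        : "=q" (value)
        : "r" (text), "m" (*(char const (*)[]) text)
    );
    return value;
}

/* Mirrors print_int: fills a local buffer back to front, then hands a
 * pointer into it to the asm. Without the clobber (or the "m" operand)
 * the optimizer would be free to drop the stores to buf. */
static uint8_t leading_digit(uint32_t n)
{
    char buf[12];
    char *p = buf + 12;
    do {
        *(--p) = '0' + (n % 10);
        n /= 10;
    } while (p > buf && n != 0);
    return read_first_byte(p);
}
```

For example, leading_digit(1234) returns the character '1', because the stores into buf are guaranteed to have happened before the asm runs. The "m" operand form is worth considering when the asm only reads a known object, since it leaves the compiler free to keep unrelated values in registers.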