Tags: c, gcc, assembly, x86-64, red-zone

Incorrect stack red-zoning on x86-64 code generation


This is compiler output from a Linux kernel function (compiled with -mno-red-zone):

load_balance:
.LFB2408:
        .loc 2 6487 0
        .cfi_startproc
.LVL1355:
        pushq   %rbp    #
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp      #,
        .cfi_def_cfa_register 6
        pushq   %r15    #
        pushq   %r14    #
        pushq   %r13    #
        pushq   %r12    #
        .cfi_offset 15, -24
        .cfi_offset 14, -32
        .cfi_offset 13, -40
        .cfi_offset 12, -48
        movq    %rdx, %r12      # sd, sd
        pushq   %rbx    #
.LBB2877:
        .loc 2 6493 0
        movq    $load_balance_mask, -136(%rbp)  #, %sfp
.LBE2877:
        .loc 2 6487 0
        subq    $184, %rsp      #,
        .cfi_offset 3, -56
        .loc 2 6489 0
     ....

Note the "subq $184, %rsp" after the compiler has already spilled to the stack (the spill is insane, btw, since it's spilling a constant value!)

Linus reported this bug to gcc 2 days ago. But I don't understand what the bug is. Why is that subq wrong?

Edit: sorry for not including this before. The bug report is here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61904


Solution

  • Why is that subq wrong?

    The subq itself is fine; the problem is its order relative to the movq $load_balance_mask, -136(%rbp) instruction. The subq allocates space on the stack by lowering the stack pointer, and the movq writes to a location within that allocated area. But in this case the movq comes before the subq, i.e. it writes to stack space that has not been allocated yet and still lies below %rsp. Now what if an interrupt occurs between the movq and the subq? An interrupt taken in kernel mode can run on the same stack, so whatever the CPU and the handler push below the current %rsp can overwrite the value that was just stored. When the function later reads -136(%rbp) back, it gets garbage instead of the address of load_balance_mask.
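    The offsets in the listing can be checked with a little arithmetic. The sketch below is only a model of the prologue shown in the question, not anything GCC or the kernel runs; the names rsp_after_pushes, rsp_at_store, store_offset and rsp_after_sub are made up for illustration, and all offsets are taken relative to %rbp after the movq %rsp, %rbp.

```python
# Offsets are in bytes, relative to %rbp after "movq %rsp, %rbp".
PUSH_SIZE = 8  # each pushq lowers %rsp by 8 bytes on x86-64


def rsp_after_pushes(count):
    """Return %rsp as an offset from %rbp after `count` pushq instructions."""
    return -PUSH_SIZE * count


# The listing pushes %r15, %r14, %r13, %r12 and %rbx before the store.
rsp_at_store = rsp_after_pushes(5)

# movq $load_balance_mask, -136(%rbp)
store_offset = -136

# subq $184, %rsp runs only after the store.
rsp_after_sub = rsp_at_store - 184

print("store is", rsp_at_store - store_offset, "bytes below %rsp when it executes")
print("store is", store_offset - rsp_after_sub, "bytes above %rsp after the subq")
```

    Under this model the store lands 96 bytes below %rsp at the moment it executes, and only ends up inside the frame (88 bytes above %rsp) once the subq has run.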

    Having the movq first would have been OK in the presence of a red zone. Quoting from Wikipedia:

    A red zone is a fixed-size area in memory beyond the stack pointer that has not been "allocated". This region of memory is not to be modified by interrupt/exception/signal handlers. This allows the space to be used for temporary data without the extra overhead of modifying the stack pointer. The x86-64 ABI mandates a 128-byte red zone.

    However, as Linus wrote in the email thread about this bug: "But we build the kernel with -mno-red-zone. We do *not* follow the x86-64 ABI wrt redzoning".
    And with red zones disabled the code generator should not have been allowed to output that movq before the subq.
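    To make the rule concrete, here is a minimal sketch of the check a code generator would have to respect. The function store_is_safe and its parameters are hypothetical names for illustration and do not correspond to anything in GCC; offsets are again relative to %rbp, with %rsp at -40 when the movq executes.

```python
def store_is_safe(store_offset, rsp_offset, red_zone_size):
    """Return True if a store at store_offset cannot be clobbered by an interrupt.

    Memory at or above %rsp is allocated. Memory below %rsp is only
    protected if it falls within red_zone_size bytes of %rsp.
    """
    lowest_protected = rsp_offset - red_zone_size
    return store_offset >= lowest_protected


STORE = -136          # movq $load_balance_mask, -136(%rbp)
RSP_BEFORE_SUB = -40  # after the five pushq instructions
RSP_AFTER_SUB = RSP_BEFORE_SUB - 184

# With the ABI's 128-byte red zone, storing before the subq is fine.
print(store_is_safe(STORE, RSP_BEFORE_SUB, red_zone_size=128))

# With -mno-red-zone there is no protected area below %rsp.
print(store_is_safe(STORE, RSP_BEFORE_SUB, red_zone_size=0))

# Storing after the subq is safe even without a red zone.
print(store_is_safe(STORE, RSP_AFTER_SUB, red_zone_size=0))
```

    The first and third checks pass and the second fails, which matches the report: the emitted order is only valid if a red zone exists, and the kernel is built without one.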