Tags: c++, linux, segmentation-fault, clang

Bug in Clang code generation for member initializers in objects on x86-64 when page alignment off?


UPDATE3: Clang issue submitted as I am now confident that this is a previously unreported compiler bug. (There are many somewhat similar but distinct issues in the LLVM bug tracker.) Thanks to all who genuinely tried to help.

UPDATE2: It turns out this bug does NOT require the use of -n (--nmagic) option. The same crashes occur on binaries just built without using the C library (-nostdlib Clang option). So building NMAGIC binaries apparently has nothing to do with the problem.

UPDATE: LLVM's own linker, LLD, supports the -n (--nmagic) option so I installed it and gave it a try. The exact same segfault occurs. Since LLVM's own linker supports building NMAGIC binaries, that strongly suggests it should work when using their compiler (Clang) too. I'll file a bug report.

Original Post:
I've come across a C++ issue where instantiating some objects in programs compiled by Clang segfaults. Thanks in advance to anyone who can help shed light on the issue.

Debugging suggests that Clang is generating SSE movaps instructions to initialize some char arrays and it's these instructions that appear to cause segfaults in some circumstances.

I've tested on multiple Linux systems with the binutils linker with both Clang 16 and Clang 17 with the same results. I'm not sure if the same problem occurs under other x86-64 operating systems or with other linkers. The problem does not occur when using versions of the GCC compiler instead of Clang.

The segfaults occur for some objects under the following minimum set of conditions:

  • x86-64 code generation
  • Optimization is enabled (-O1 or above)
  • [EDIT - This was incorrect. The segfaults do not require the -n (--nmagic) linker option. All that's necessary is to build a binary without the standard C library (-nostdlib option to Clang). I should also point out here that use of the -fno-builtin option makes no difference.]

Here is minimal code to reproduce. Compile with the following on an x86-64 Linux system [EDIT: removed unnecessary linker option]:

$ clang -O1 -nostdlib -fno-stack-protector -static clang_segv.s clang_segv.cc -o clang_segv

clang_segv.cc:

struct SegV
{
    void set(const char *s) { char *b = buf; while ( *s ) { *b++ = *s++; } *b = '\0'; }
    char  buf[128] = "";
    char *cursor   = buf; // needed for segfault
};

int
main()
{
    SegV v;
    v.set("aa");
    return 0;
}

clang_segv.s

.intel_syntax noprefix

.global _start
_start:
    xor rbp,rbp             # Zero stack base pointer

    xor r9,r9
    pop rdi                 # Pop argc off stack -> rdi for 1st arg to main()
    mov rsi,rsp             # Argv @top of stack -> rsi for 2nd arg to main()
    call main               # Call main()... return result ends up in rax

    xor r9,r9
    mov rdi,rax             # Move main()'s return to 1st argument for exit()
    mov rax,231             # exit_group() syscall
    syscall                 # Tell kernel to exit program

This example is the minimum reproducer I could come up with and has no resemblance to the original code where I noticed the problem save that both have objects with char arrays. Changing the code around can mask or unmask the issue which usually suggests a coding error but I'm at a loss to find one in this simple example.

My debugger seems to think that the problem lies in the instructions that Clang is generating to initialize the 'buf' char array. [Image: debugger suggesting an issue with the movaps instructions]

My debugger says that Clang generates the following code for the initialization of char buf[128]:

0x400171 xorps  %xmm0,%xmm0
0x400174 movaps %xmm0,-0x10(%rsp)
0x400179 movaps %xmm0,-0x20(%rsp)
0x40017e movaps %xmm0,-0x30(%rsp)
0x400183 movaps %xmm0,-0x40(%rsp)
0x400188 movaps %xmm0,-0x50(%rsp)
0x40018d movaps %xmm0,-0x60(%rsp)
0x400192 movaps %xmm0,-0x70(%rsp)
0x400197 movaps %xmm0,-0x80(%rsp)
0x40019c lea    -0x80(%rsp),%rax

and that the segfault is generated by the first movaps instruction.

Obviously, I expect that the initialization of the array shouldn't segfault.

It doesn't matter if in-class initialization is used as I do here in this example or if initializer-list initialization is used. Both methods suffer from the same problem.

I believe the problem might be a mismatch in how the code generated by clang thinks the members of the object are (should be) aligned versus how the members are actually aligned. I might be wrong about that but if I add alignas(32) to the structure or to the char array the problem goes away. I don't know why I'd need to align at 32 bytes. Perplexingly (to me), aligning at 16 bytes does not mask the problem.
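One way to test this alignment hypothesis is to read rsp in the debugger at the faulting instruction and check the operand address by hand. Below is a minimal sketch of that check; is_aligned and movaps_operand_ok are hypothetical helper names, and the rsp values are made up for illustration only. It relies on the fact that movaps faults when its memory operand is not a multiple of 16.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if addr is a multiple of alignment.
constexpr bool is_aligned(std::uintptr_t addr, std::size_t alignment)
{
    return addr % alignment == 0;
}

// Convenience overload for real pointers (not usable in constant expressions).
inline bool is_aligned(const void *p, std::size_t alignment)
{
    return is_aligned(reinterpret_cast<std::uintptr_t>(p), alignment);
}

// For an operand such as -0x10(%rsp), movaps needs (rsp - 0x10) % 16 == 0.
// rsp is the value at the moment the instruction executes.
constexpr bool movaps_operand_ok(std::uintptr_t rsp, std::uintptr_t below)
{
    return is_aligned(rsp - below, 16);
}

// Illustrative rsp values only; real ones come from the debugger.
static_assert( movaps_operand_ok(0x7ffc12345670, 0x10), "aligned rsp is fine");
static_assert(!movaps_operand_ok(0x7ffc12345678, 0x10), "rsp off by 8 would fault");
```

The pointer overload can be called as is_aligned(v.buf, 16) from inside the program, though an unoptimized build may lay out the stack differently from the -O1 build that crashes.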

The problem also goes away if I just tell Clang not to generate any SSE instructions with -mno-sse but I'd rather not lose those optimizations. In this case Clang uses movq instructions to initialize the array instead of movaps.

The problem also goes away if I forgo initialization of the char array member and do it manually in the constructor but of course that is less efficient.

At this point to me this looks like a compiler bug. Or am I just using it wrong?

Thanks!


Solution

  • SOLVED: In _start the stack is already 16-byte aligned, so by popping 8 bytes (argc) off it in my startup code I was actually misaligning it before the call to main(). The SysV x86-64 ABI requires the stack to be 16-byte aligned at the point of any call, so it is perfectly reasonable for Clang to assume that the stack is aligned a certain way upon entry to main() and to generate code accordingly. In other words: This is not a bug. (And, again, it has nothing to do with using the --nmagic linker option, which is perfectly fine to use.)
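
The arithmetic behind this can be sketched in a few lines. This is only a model of the stack pointer, not the startup code itself: the function names and the entry_rsp value are made up for illustration, and the "fixed" sequence assumes one common remedy, realigning with and rsp,-16 after saving argv in rsi and before the call.

```cpp
#include <cstdint>

// Model of rsp through _start. A pop adds 8 to rsp; a call pushes an
// 8-byte return address; "and rsp, -16" rounds rsp down to a multiple of 16.
constexpr std::uintptr_t after_pop(std::uintptr_t rsp)  { return rsp + 8; }
constexpr std::uintptr_t after_call(std::uintptr_t rsp) { return rsp - 8; }
constexpr std::uintptr_t realign16(std::uintptr_t rsp)  { return rsp & ~std::uintptr_t{15}; }

// The callee may assume rsp + 8 is a multiple of 16 on entry, because the
// caller was 16-byte aligned immediately before the call instruction.
constexpr bool entry_ok(std::uintptr_t rsp_at_entry)
{
    return (rsp_at_entry + 8) % 16 == 0;
}

// Illustrative value: rsp is 16-byte aligned when _start begins.
constexpr std::uintptr_t entry_rsp = 0x7ffc12345670;

// Original _start: pop rdi; mov rsi,rsp; call main
static_assert(!entry_ok(after_call(after_pop(entry_rsp))),
              "popping argc leaves main() with a misaligned stack");

// Assumed fix: pop rdi; mov rsi,rsp; and rsp,-16; call main
static_assert(entry_ok(after_call(realign16(after_pop(entry_rsp)))),
              "realigning before the call restores the ABI guarantee");
```

Reading argc with a mov from [rsp] instead of popping it would leave rsp untouched and should satisfy the same check. This model is also consistent with the alignas(32) observation: an object needing more than the 16 bytes the ABI guarantees forces the compiler to realign the stack itself, whereas alignas(16) lets it keep trusting the incoming alignment.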