gcc, gcc-warning

Can gcc omit reserving data on the stack?


I'm using gcc 12.2.0 on x86_64, compiling 64-bit code natively. I've run into an odd issue and have reduced it to a minimal reproducer:

#include <stdint.h>
#include <stdbool.h>

struct foobar_t {
    uint8_t data[512];
};

void my_memset(void *target) {
#if 1
    for (int i = 0; i < 256; i++) {
        ((uint16_t*)target)[i] = 0xabcd;
    }
#else
    for (int i = 0; i < 512; i++) {
        ((uint8_t*)target)[i] = 0xab;
    }
#endif
}

int main() {
    struct foobar_t foobar;
    my_memset(&foobar);
    if (foobar.data[123] == 0) {
        volatile int x = 0;
    }
    return 0;
}

When the #if 1 path is taken, I get a compiler warning:

$ gcc -O3 -fno-stack-protector -Wall -c -o x.o x.c
[...]
x.c:46:24: warning: ‘foobar’ is used uninitialized [-Wuninitialized]
   46 |         if (foobar.data[123] == 0) {

That warning disappears completely when I use the second code path (#if 0). The only difference is that the first path sets 256 16-bit words while the second sets 512 bytes.

In the case that I get the warning, the generated assembly also looks wrong:

0000000000000000 <my_memset>:
   0:   f3 0f 1e fa             endbr64
   4:   66 0f 6f 05 00 00 00    movdqa 0x0(%rip),%xmm0        # c <my_memset+0xc>
   c:   48 8d 87 00 02 00 00    lea    0x200(%rdi),%rax
  13:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
  18:   0f 11 07                movups %xmm0,(%rdi)
  1b:   48 83 c7 10             add    $0x10,%rdi
  1f:   48 39 f8                cmp    %rdi,%rax
  22:   75 f4                   jne    18 <my_memset+0x18>
  24:   c3                      ret


0000000000000030 <main>:
  30:   f3 0f 1e fa             endbr64
  34:   48 81 ec a0 01 00 00    sub    $0x1a0,%rsp
  3b:   66 0f 6f 05 00 00 00    movdqa 0x0(%rip),%xmm0        # 43 <main+0x13>
  43:   48 8d 44 24 98          lea    -0x68(%rsp),%rax
  48:   48 8d 94 24 98 01 00    lea    0x198(%rsp),%rdx
  50:   0f 29 00                movaps %xmm0,(%rax)
  53:   48 83 c0 10             add    $0x10,%rax
  57:   48 39 c2                cmp    %rax,%rdx
  5a:   75 f4                   jne    50 <main+0x20>
  5c:   80 7c 24 13 00          cmpb   $0x0,0x13(%rsp)
  61:   75 08                   jne    6b <main+0x3b>
  63:   c7 44 24 94 00 00 00    movl   $0x0,-0x6c(%rsp)
  6b:   31 c0                   xor    %eax,%eax
  6d:   48 81 c4 a0 01 00 00    add    $0x1a0,%rsp
  74:   c3                      ret

This reserves only 0x1a0 (416) bytes on the stack. That does not fit the 512-byte structure! How can that be? What is the reason for this?

I've tried removing as much code as possible while still retaining the warning. If I disable optimization, the warning also goes away.


Solution

  • Your #if 1 code has undefined behavior because it violates the strict aliasing rule. Very roughly speaking, and subject to certain narrow exceptions, you must not access an object through a pointer to a type that differs from the object's own type. Here foobar.data is an array of uint8_t, but the loop writes to it through a uint16_t pointer.

    As such, the compiler is entitled to assume that accesses to memory through one pointer type aren't seen by accesses through another pointer type. So it's not surprising that it would think that foobar is uninitialized, since it doesn't consider the possibility that an access to a uint16_t object could touch it.

    There is an exception in the standard for character types, precisely so that you can implement things like memset and memcpy using character pointers. So your #else code is legal, and in fact the compiler is able to recognize that the my_memset code does initialize foobar, and so you don't get the warning. (Strictly speaking your code ought to use unsigned char instead of uint8_t. On most implementations uint8_t is a typedef for unsigned char, but the language standard does not guarantee that.)
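
One way to keep the 16-bit pattern while staying within the aliasing rules is to copy each word with memcpy instead of storing through a uint16_t pointer. The sketch below assumes a hypothetical helper named fill_u16, which is not part of the original question. Compilers commonly turn such a small fixed-size memcpy into an ordinary store, though that is an optimization and not a guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct foobar_t {
    uint8_t data[512];
};

/* Hypothetical replacement for my_memset: writes 256 copies of a 16-bit
   pattern without ever dereferencing a uint16_t pointer. */
void fill_u16(void *target, uint16_t pattern) {
    assert(target != NULL);
    unsigned char *bytes = target;  /* character-type access is always allowed */
    for (size_t i = 0; i < 256; i++) {
        /* memcpy copies the bytes of `pattern`, so the store is valid no
           matter what type the destination object was declared with. */
        memcpy(bytes + i * sizeof pattern, &pattern, sizeof pattern);
    }
}
```

Calling fill_u16(&foobar, 0xabcd) in place of my_memset(&foobar) fills the same 512 bytes. The byte order within each word follows the target's endianness, just as the original uint16_t stores would have on a conforming access.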


    The apparently insufficient stack reservation is actually normal and not a problem. The object foobar is located on the stack from offset rsp-0x68 up to rsp+0x198, which is 0x198 + 0x68 = 0x200, precisely 512 bytes, just as it should be. It may look strange that part of it is below the stack pointer, but this is okay because it is within the 128-byte red zone defined by the x86-64 System V ABI.

    The red zone is only usable in leaf functions (i.e. those which don't call other functions), so it can only be used in main if the call to my_memset is inlined. This isn't done when optimizations are off, so you don't see the red zone used in that case.

    Using the red zone doesn't really accomplish much in this example. The main benefit is in functions where, by using the red zone, you avoid having to adjust the stack pointer at all. Here, the stack pointer has to be adjusted anyway, so nothing is gained compared with the more natural implementation of subtracting a full 512 bytes from the stack pointer. But the code with the red zone is still perfectly valid and equivalent in terms of performance; it just looks funny. So this is just a slightly odd quirk of the compiler's stack layout algorithm.
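
The stack-layout arithmetic can be checked mechanically. The sketch below takes the constants straight from the disassembly of main in the question; the enum and function names are made up for illustration, and the 128-byte figure is the red zone size from the x86-64 System V ABI.

```c
#include <assert.h>

enum {
    FRAME_SIZE = 0x1a0,  /* sub  $0x1a0,%rsp       : bytes reserved        */
    BUF_LOW    = -0x68,  /* lea  -0x68(%rsp),%rax  : start of foobar       */
    BUF_HIGH   = 0x198,  /* lea  0x198(%rsp),%rdx  : one past end of foobar */
    X_OFFSET   = -0x6c,  /* movl $0x0,-0x6c(%rsp)  : the volatile int x    */
    RED_ZONE   = 128     /* bytes below rsp a leaf function may use        */
};

/* Total size of the buffer, counting the part below the stack pointer. */
long buffer_size(void) { return BUF_HIGH - BUF_LOW; }

/* Deepest byte used below rsp: x occupies rsp-0x6c up to rsp-0x68. */
long red_zone_used(void) { return -X_OFFSET; }

/* rsp-relative offset of foobar.data[index]. */
long data_offset(long index) { return BUF_LOW + index; }
```

With these numbers, buffer_size() is 512, red_zone_used() is 108 (which fits in the 128-byte red zone), and data_offset(123) is 0x13, matching the cmpb $0x0,0x13(%rsp) instruction that tests foobar.data[123].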