Tags: c, assembly, stack-overflow, coredump

Strange size change in core dump file


I have the following two C snippets, both of which obviously overflow the stack through unbounded recursion:

a.c

int f(int i) {
    f(i);
}

int main() {
    f(1);
}

b.c

int f(int i) {
    f(i+1); 
}

int main() {
    f(1);
}

After running both and looking at the output of coredumpctl list, the core dump sizes are very different:

Tue 2024-02-20 15:38:28 +0330 420696 1000 1000 SIGSEGV present  /tmp/a  204.2K
Tue 2024-02-20 15:38:30 +0330 420710 1000 1000 SIGSEGV present  /tmp/b  899.7K

The core dump of the second program (b.c) is more than 4 times the size of the first one (a.c). This seems very strange to me, since the two programs don't differ in any significant way. Can someone explain this behavior?

Edit

I used this command to compile both files:

$ gcc a.c -o a && gcc b.c -o b

The gcc version I used:

$ gcc --version
gcc (Debian 12.2.0-14) 12.2.0

This is the assembly generated for a.c (using objdump -S):

0000000000001129 <f>:
    1129:   55                      push   %rbp
    112a:   48 89 e5                mov    %rsp,%rbp
    112d:   48 83 ec 10             sub    $0x10,%rsp
    1131:   89 7d fc                mov    %edi,-0x4(%rbp)
    1134:   8b 45 fc                mov    -0x4(%rbp),%eax
    1137:   89 c7                   mov    %eax,%edi
    1139:   e8 eb ff ff ff          call   1129 <f>
    113e:   90                      nop
    113f:   c9                      leave
    1140:   c3                      ret

0000000000001141 <main>:
    1141:   55                      push   %rbp
    1142:   48 89 e5                mov    %rsp,%rbp
    1145:   bf 01 00 00 00          mov    $0x1,%edi
    114a:   e8 da ff ff ff          call   1129 <f>
    114f:   b8 00 00 00 00          mov    $0x0,%eax
    1154:   5d                      pop    %rbp
    1155:   c3                      ret

And for b.c:

0000000000001129 <f>:
    1129:   55                      push   %rbp
    112a:   48 89 e5                mov    %rsp,%rbp
    112d:   48 83 ec 10             sub    $0x10,%rsp
    1131:   89 7d fc                mov    %edi,-0x4(%rbp)
    1134:   8b 45 fc                mov    -0x4(%rbp),%eax
    1137:   83 c0 01                add    $0x1,%eax
    113a:   89 c7                   mov    %eax,%edi
    113c:   e8 e8 ff ff ff          call   1129 <f>
    1141:   90                      nop
    1142:   c9                      leave
    1143:   c3                      ret

0000000000001144 <main>:
    1144:   55                      push   %rbp
    1145:   48 89 e5                mov    %rsp,%rbp
    1148:   bf 01 00 00 00          mov    $0x1,%edi
    114d:   e8 d7 ff ff ff          call   1129 <f>
    1152:   b8 00 00 00 00          mov    $0x0,%eax
    1157:   5d                      pop    %rbp
    1158:   c3                      ret

Solution

  • The default in systemd's coredump.conf is Compress=yes, according to the man page.
    Presumably that's with zstd or gzip.

    The size depends not just on the amount of address space in use, but also on how compressible the data is. Your a has a repeating pattern, with the same i in every stack frame, so it compresses better. (The saved RBP is different in every frame, but the return address is the same, and the 12 unwritten bytes of each 32-byte frame will be 0 below the first few frames, which the _start / dynamic linker code might have dirtied before reaching main.)

    b doesn't: i+1 produces a different value in every stack frame, and that value changes in a different way from the saved RBP.

    I don't think zstd or gzip do delta compression of changing patterns; they just look for exact matches. Or, if they do look for deltas, maybe having two changing values (the saved RBP and the spilled i) throws that off. Both usually change only in the low byte: i changes by 1, and the saved RBP changes by 32 (the size of each stack frame).