
mmap causes stack corruption, kernel involved?


We are getting segfaults with this code:

#include <fcntl.h>
#include <sys/mman.h>
#include <stdio.h>

#define CHUNKSIZE 4096

int main(int argc, char **argv) {
    printf("Hallo!\n"); // does not segfault without this line

    void* first_chunk = mmap(NULL, CHUNKSIZE, PROT_NONE, MAP_SHARED | MAP_ANONYMOUS, 0, 0);
    void* next_chunk_addr = (void*) ((char*)first_chunk + CHUNKSIZE);
    mmap(next_chunk_addr, CHUNKSIZE, PROT_NONE, MAP_SHARED | MAP_FIXED | MAP_ANONYMOUS, 0, 0);

    printf("Bumm!\n"); // segfaults
}

Even if the address of the second mmap-call is invalid, I believe I should get a MAP_FAILED instead of a broken stack.

GDB gives me this:

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7a94104 in _IO_file_xsputn () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0  0x00007ffff7a94104 in _IO_file_xsputn () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007ffff7a8ad79 in puts () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x0000000000400591 in main (argc=1, argv=0x7fffffffdf28) at test.cpp:14
(gdb) x/i $rip
=> 0x7ffff7a94104 <_IO_file_xsputn+324>:    mov    %dl,(%r8,%rax,1)

Why is it trying to access 0x7ffff7ff8000, which has nothing to do with the string it is supposed to print?

On another machine, we got this stack trace with related code:

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7b0985a in mmap64 () at ../sysdeps/unix/syscall-template.S:81
81  ../sysdeps/unix/syscall-template.S: No such file or directory.

Can this have something to do with the kernel side?

This happens on three different Linux systems, with both gcc and clang. The same program does not crash under OS X.


Solution

  • The kernel is behaving exactly as expected. Take note of this sentence from the mmap(2) documentation on MAP_FIXED:

    If the memory region specified by addr and len overlaps pages of any existing mapping(s), then the overlapped part of the existing mapping(s) will be discarded.

    If you run the program under strace(1), you can see exactly that happening:

    ...
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0710755000
    write(1, "Hallo!\n", 7)                 = 7
    mmap(NULL, 4096, PROT_NONE, MAP_SHARED|MAP_ANONYMOUS, 0, 0) = 0x7f0710754000
    mmap(0x7f0710755000, 4096, PROT_NONE, MAP_SHARED|MAP_FIXED|MAP_ANONYMOUS, 0, 0) = 0x7f0710755000
    --- SIGSEGV (Segmentation fault) @ 0 (0) ---
    +++ killed by SIGSEGV (core dumped) +++
    

    The first call to printf() (which, as the backtrace shows, the compiler optimized into a call to puts()) allocates a buffer because stdout is buffered; that allocation is the first mmap() in the trace, at 0x7f0710755000. The program then calls mmap(NULL, ...) and receives the page immediately below that buffer, at 0x7f0710754000. Adding CHUNKSIZE to that address yields 0x7f0710755000 again, so the second mmap(), with MAP_FIXED, maps a new page directly on top of the stdout buffer. The old contents are discarded and, because the new mapping is PROT_NONE, the page is no longer accessible at all. The subsequent call to printf() (actually puts()) then crashes when it tries to append to the buffer it still believes it owns. No MAP_FAILED is returned because, as far as the kernel is concerned, replacing an existing mapping is what MAP_FIXED asks for.
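One way to avoid clobbering a mapping you do not own is to reserve the whole address range in a single mmap() call and use MAP_FIXED only inside that reservation. The following is a minimal sketch of that idea, not the original poster's code: the helper names reserve_chunks and commit_chunk are hypothetical, and the choice of MAP_PRIVATE, PROT_READ | PROT_WRITE and an fd of -1 are assumptions made for the example.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

#define CHUNKSIZE 4096

/* Hypothetical helper: reserve `count` contiguous chunks with ONE mmap()
 * call, so the kernel picks a range that is free for its whole length.
 * The pages are PROT_NONE until committed. Returns NULL on failure. */
static void *reserve_chunks(size_t count) {
    void *base = mmap(NULL, count * CHUNKSIZE, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return base == MAP_FAILED ? NULL : base;
}

/* Hypothetical helper: make chunk `index` of the reservation usable.
 * MAP_FIXED is safe here only because the target address lies inside
 * the range returned by reserve_chunks(), so the only mapping it can
 * replace is our own. Returns NULL on failure. */
static void *commit_chunk(void *base, size_t index) {
    void *addr = (char *)base + index * CHUNKSIZE;
    void *p = mmap(addr, CHUNKSIZE, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

A caller would do something like `void *base = reserve_chunks(2);` followed by `commit_chunk(base, 1);` and release everything with `munmap(base, 2 * CHUNKSIZE)`. On Linux 4.17 and later, MAP_FIXED_NOREPLACE is another option: it fails with EEXIST instead of replacing an existing mapping, which is closer to the MAP_FAILED behavior the question expected.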