c linux shared-memory fallocate

Using fallocate() after shm_open() results in memory not being freed after shm_unlink()


I have an application that uses shared memory with memory mapped files. The target operating system is Ubuntu 14.04 (64-bit). The Linux kernel on this distro is at version 4.4.0. gcc is at version 4.8.4.

Until recently I was using the following function calls (in the shown order) to allocate and deallocate the shared memory.

shm_open
ftruncate
mmap
/* use shared memory */
munmap
shm_unlink
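For reference, the sequence above might look roughly like the following in C. This is only a sketch: the helper name shm_roundtrip_ftruncate, the permission bits, and the memset standing in for "use shared memory" are illustrative assumptions, and the close() call is an addition that is not part of the sequence listed above.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create, size, map, touch and tear down a POSIX
 * shared-memory object. Returns 0 on success, -1 on failure. */
static int shm_roundtrip_ftruncate(const char *name, size_t size)
{
    int rc = -1;
    void *p = MAP_FAILED;

    int fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);
    if (fd == -1) {
        fprintf(stderr, "shm_open failed (%s)\n", strerror(errno));
        return -1;
    }

    /* ftruncate only sets the object's size; the pages themselves are
     * allocated lazily, when they are first touched. */
    if (ftruncate(fd, (off_t)size) == -1) {
        fprintf(stderr, "ftruncate failed (%s)\n", strerror(errno));
        goto out;
    }

    p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        fprintf(stderr, "mmap failed (%s)\n", strerror(errno));
        goto out;
    }

    /* Use the shared memory. If the backing filesystem cannot supply a
     * page at this point, the process receives SIGBUS. */
    memset(p, 0, size);
    rc = 0;

out:
    if (p != MAP_FAILED)
        munmap(p, size);
    close(fd);          /* addition of this sketch, not in the list above */
    shm_unlink(name);
    return rc;
}
```

A call such as shm_roundtrip_ftruncate("/test", 4096) would exercise the whole sequence with a single page.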

The problem with this approach is that it does not detect whether enough memory is available for the shared memory. If there is not, the application crashes later with a SIGBUS signal when the shared memory is accessed.

I found that other people had the same issue here, and they solved it by using fallocate() instead of ftruncate(). fallocate() returns an error if there is not enough memory available for the requested size.
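A minimal sketch of that idea, assuming Linux with glibc and _GNU_SOURCE defined. The helper name shm_create_reserved and the permission bits are hypothetical and chosen only for illustration.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create a shared-memory object and reserve `size`
 * bytes up front. Returns the open descriptor, or -1 with errno set. */
static int shm_create_reserved(const char *name, size_t size)
{
    int fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return -1;

    /* Mode 0 allocates the range [0, size) and extends the file size to
     * match, so a shortage shows up here instead of as a later SIGBUS. */
    if (fallocate(fd, 0, 0, (off_t)size) == -1) {
        int saved = errno;   /* typically ENOSPC when space runs out */
        close(fd);
        shm_unlink(name);
        errno = saved;       /* report the fallocate error to the caller */
        return -1;
    }
    return fd;
}
```

The caller owns the returned descriptor and is expected to mmap it, and later to release it with munmap(), close() and shm_unlink().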

I have implemented the same in my application, and fallocate() properly detects the situation when not enough memory is available. However, I am now running into a different problem.

The problem is that the memory reserved by fallocate() is not freed after calling shm_unlink(). This was not an issue when using ftruncate().

Consider the following minimal example (fallocate.c) that exhibits this behavior.

#include <stdio.h>
#include <string.h>

#include <errno.h>
#include <sys/mman.h>
#include <fcntl.h>

static const char* name = "/test";
static const size_t size = (size_t)4*1024*1024*1024;

int main ()
{
    int fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, S_IRWXU | S_IRWXG | S_IRWXO);
    if (fd == -1) {
        printf("shm_open failed (%s)\n", strerror(errno));
        return 1;
    }

    if (fallocate(fd, 0, 0, size)) {
        printf("fallocate failed (%s)\n", strerror(errno));
        return 1;
    }

    if (shm_unlink(name)) {
        printf("shm_unlink failed (%s)\n", strerror(errno));
        return 1;
    }

    printf("break here to check if memory still used\n");

    return 0;
}

I used the following CMakeLists.txt for compiling.

add_executable(fallocate fallocate.c)
target_compile_definitions(fallocate PRIVATE _GNU_SOURCE)
target_link_libraries(fallocate PRIVATE rt)

Run this example in gdb and break on the last printf statement. You will see the following behavior.

  • The test file is no longer present in /dev/shm
  • The memory is still in the "used" category when looking at the top output; it will only move to the "free" category once the process terminates

Is this expected behavior or am I using the API wrongly?


Edit: as requested, here is the process address space after shm_unlink() (I used gets() after shm_unlink() to hold the process).

Output of cat /proc/<PID>/status

Name:   fallocate
State:  S (sleeping)
Tgid:   12445
Ngid:   0
Pid:    12445
PPid:   26349
TracerPid:      0
Uid:    1001    1001    1001    1001
Gid:    1001    1001    1001    1001
FDSize: 256
Groups: 4 27 108 124 999 1001 1002
NStgid: 12445
NSpid:  12445
NSpgid: 12445
NSsid:  26349
VmPeak:     8628 kB
VmSize:     8460 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:       840 kB
VmRSS:       840 kB
VmData:       80 kB
VmStk:       132 kB
VmExe:         4 kB
VmLib:      2052 kB
VmPTE:        36 kB
VmPMD:        12 kB
VmSwap:        0 kB
HugetlbPages:          0 kB
Threads:        1
SigQ:   0/61795
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: 0000000180000000
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 0000003fffffffff
CapAmb: 0000000000000000
Seccomp:        0
Speculation_Store_Bypass:       thread vulnerable
Cpus_allowed:   ff
Cpus_allowed_list:      0-7
Mems_allowed:   00000000,00000001
Mems_allowed_list:      0
voluntary_ctxt_switches:        1
nonvoluntary_ctxt_switches:     2

Output of pmap <PID>

0000000000400000      4K r-x-- fallocate
0000000000600000      4K r---- fallocate
0000000000601000      4K rw--- fallocate
00007f1e92093000    100K r-x-- libpthread-2.19.so
00007f1e920ac000   2044K ----- libpthread-2.19.so
00007f1e922ab000      4K r---- libpthread-2.19.so
00007f1e922ac000      4K rw--- libpthread-2.19.so
00007f1e922ad000     16K rw---   [ anon ]
00007f1e922b1000   1784K r-x-- libc-2.19.so
00007f1e9246f000   2048K ----- libc-2.19.so
00007f1e9266f000     16K r---- libc-2.19.so
00007f1e92673000      8K rw--- libc-2.19.so
00007f1e92675000     20K rw---   [ anon ]
00007f1e9267a000     28K r-x-- librt-2.19.so
00007f1e92681000   2044K ----- librt-2.19.so
00007f1e92880000      4K r---- librt-2.19.so
00007f1e92881000      4K rw--- librt-2.19.so
00007f1e92882000    140K r-x-- ld-2.19.so
00007f1e92a75000     16K rw---   [ anon ]
00007f1e92aa3000      4K rw---   [ anon ]
00007f1e92aa4000      4K r---- ld-2.19.so
00007f1e92aa5000      4K rw--- ld-2.19.so
00007f1e92aa6000      4K rw---   [ anon ]
00007ffe6f72b000    132K rw---   [ stack ]
00007ffe6f7ee000     12K r----   [ anon ]
00007ffe6f7f1000      8K r-x--   [ anon ]
ffffffffff600000      4K r-x--   [ anon ]
 total             8464K

Solution

  • You're not closing the open file descriptor, and the shared-memory "file" most likely lives in tmpfs, a memory-backed filesystem (assuming Linux).

    This code creates a file:

    int fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, S_IRWXU | S_IRWXG | S_IRWXO);
    

    This code makes it big (4 GB):

    if (fallocate(fd, 0, 0, size)) {
    

    This code just unlinks it from the filesystem:

    if (shm_unlink(name)) {
    

    At that point, the open file descriptor means the backing file still exists even though its name has been removed from the directory. (That's literally what "unlink" means.) Such a file is not actually removed from the filesystem until the last reference to it is gone, and here that last reference is your process's open file descriptor.

    Add

    close( fd );
    

    and check system memory usage before and after the close() call.
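One way to make that check from inside the program is sketched below. It assumes the shared-memory tmpfs is mounted at /dev/shm, which is typical for glibc on Linux but not guaranteed. The helper names shm_free_bytes and allocate_unlink_close are hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/statvfs.h>
#include <unistd.h>

/* Free bytes on the filesystem backing POSIX shared memory.
 * Assumption: that filesystem is mounted at /dev/shm. */
static long long shm_free_bytes(void)
{
    struct statvfs vfs;
    if (statvfs("/dev/shm", &vfs) == -1)
        return -1;
    return (long long)vfs.f_bfree * (long long)vfs.f_frsize;
}

/* Hypothetical helper mirroring the question's example, plus close().
 * Prints the free space after each step; returns 0 on success. */
static int allocate_unlink_close(const char *name, size_t size)
{
    int fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, S_IRUSR | S_IWUSR);
    if (fd == -1) {
        fprintf(stderr, "shm_open failed (%s)\n", strerror(errno));
        return -1;
    }

    if (fallocate(fd, 0, 0, (off_t)size) == -1) {
        fprintf(stderr, "fallocate failed (%s)\n", strerror(errno));
        close(fd);
        shm_unlink(name);
        return -1;
    }
    printf("after fallocate:  %lld bytes free\n", shm_free_bytes());

    /* Removes only the name; the descriptor keeps the pages alive. */
    if (shm_unlink(name) == -1) {
        fprintf(stderr, "shm_unlink failed (%s)\n", strerror(errno));
        close(fd);
        return -1;
    }
    printf("after shm_unlink: %lld bytes free\n", shm_free_bytes());

    /* Dropping the last reference lets the kernel release the pages. */
    if (close(fd) == -1) {
        fprintf(stderr, "close failed (%s)\n", strerror(errno));
        return -1;
    }
    printf("after close:      %lld bytes free\n", shm_free_bytes());
    return 0;
}
```

Calling allocate_unlink_close("/test", size) with the size from the question should show the free space recovering only after the close() step, provided nothing else is allocating on the same filesystem at the same time. A mapping created with mmap() also keeps the object alive, so in the full application munmap() is needed as well.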