rust, valgrind, massif

Heap size for Rust programs very large when measured with valgrind using massif


I'm trying to measure the memory usage of a Rust program I'm writing. I noticed that when I measure the heap size with the command:

valgrind --tool=massif --pages-as-heap=yes ./program

and inspect the result with ms_print, the reported memory size is quite large (initially around 16MB). Eventually, I reduced my Rust program to an empty main function:

fn main() {
}

I compiled it and still saw 16MB as the memory size. When I ran the very same binary on a different machine, the total was 4MB. A friend tried the same program on his machine, with the same Rust and valgrind versions, and also got 4MB.

I imagine this is some sort of pre-allocation of memory that might later be used for the heap, but I can't figure out any way to control it. I even tried changing my allocator following this guide, but nothing changed.

System details:

OS version       = Ubuntu 18.04
valgrind version = valgrind-3.13.0
cargo version    = cargo 1.39.0-nightly (3f700ec43 2019-08-19)
rustc version    = rustc 1.39.0-nightly (e44fdf979 2019-08-21)
ms_print         = ms_print-3.13.0
libc version     = ldd (Ubuntu GLIBC 2.27-3ubuntu1) 2.27

Solution

  • What is misleading you is pages-as-heap, because of a misconception about how page management works on most modern operating systems, particularly Linux. What follows will not hold on every platform; it depends on the allocator, the underlying platform and the MMU. In practice, if your platform supports virtual memory, it most likely behaves like this.

    A page is not necessarily a forcibly reserved area of physical memory. Most memory functions (mmap, malloc and a few others) will allocate address space, but the operating system/kernel treats the request only as an indication of intent. You can convince yourself of this with the following test:

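    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h> /* declares sleep() */
    
    int main(int argc, char **argv) {
      /* Request 1 GiB but never write to it. */
      void *ptr = malloc(1024 * 1024 * 1024);
      /* Stay alive long enough to inspect the system with free. */
      sleep(100);
      return 0;
    }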
    

    Start many instances of it in the background, and compare the output of free before and after:

    :~# free -m
                  total        used        free      shared  buff/cache   available
    Mem:          15930        1716        5553         170        8661       13752
    Swap:             0           0           0
    :~# ./hog &
    [1] 27577
    ...
    [99] 27674
    :~# free -m
                  total        used        free      shared  buff/cache   available
    Mem:          15930        1717        5552         170        8661       13751
    Swap:             0           0           0
    

    You can replicate this test in Rust, but you need to go slightly lower in the abstractions than you normally would:

    fn main() {
      let mut vec: Vec<u8> = Vec::new();
      // Ask for 1 GiB of capacity without writing a single byte to it.
      vec.reserve(1024 * 1024 * 1024);
    }
    

    Memory only matters once it is initialized and accessed. At that point, the OS knows you actually want it and backs it with physical memory. Rust is no exception to this: until you use some of that heap, it is just an mmap in the kernel pointing to virtual memory.
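
    A rough way to watch this from inside a Rust process is to read the resident set size (`VmRSS`) from `/proc/self/status` before reserving a buffer, after reserving it, and after writing to it. This is only a sketch: the helper names `status_kib` and `reserve_then_touch` are made up for illustration, it assumes a Linux `/proc` filesystem, and the exact figures depend on the allocator and the kernel.

    ```rust
    use std::fs;
    use std::hint::black_box;

    /// Reads a field such as "VmRSS" (in KiB) from /proc/self/status.
    /// Returns None when the file or the field is unavailable (e.g. not on Linux).
    fn status_kib(field: &str) -> Option<u64> {
        let status = fs::read_to_string("/proc/self/status").ok()?;
        status.lines().find_map(|line| {
            let rest = line.strip_prefix(field)?.strip_prefix(':')?;
            rest.split_whitespace().next()?.parse().ok()
        })
    }

    /// Reserves `bytes` of capacity, then writes to every byte.
    /// Returns the resident set size (KiB) at the start, after reserving,
    /// and after touching the buffer.
    fn reserve_then_touch(bytes: usize) -> Option<(u64, u64, u64)> {
        let start = status_kib("VmRSS")?;

        // Capacity only: nothing has been written to the buffer yet.
        let mut buf: Vec<u8> = Vec::with_capacity(bytes);
        black_box(buf.as_ptr()); // keep the optimizer from discarding the allocation
        let reserved = status_kib("VmRSS")?;

        // Writing non-zero bytes touches every page of the buffer.
        buf.resize(bytes, 1);
        black_box(&buf);
        let touched = status_kib("VmRSS")?;

        Some((start, reserved, touched))
    }

    fn main() {
        match reserve_then_touch(64 * 1024 * 1024) {
            Some((start, reserved, touched)) => {
                println!("RSS at start:       {start} KiB");
                println!("RSS after reserve:  {reserved} KiB");
                println!("RSS after touching: {touched} KiB");
            }
            None => println!("/proc/self/status is not available on this platform"),
        }
    }
    ```

    On a typical Linux setup the second figure should stay close to the first, while the third should grow by roughly the size of the buffer.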

    As such, by using the pages-as-heap argument, you are only looking at the "potential" memory, that is, everything that has been mapped before any real allocation takes place, not the memory actually used. Remove this parameter and you'll see your program consume 300 or so bytes of heap (which you can easily analyze with valgrind itself).

    Your friends are seeing a different output because their page size is 4kB and yours is 16kB. Rust allocates 1024 pages, which comes to 4MB with 4kB pages and 16MB with 16kB pages. I'll track down the exact point in the rustc source code later.
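
    To check the page-size explanation on each machine, one option is to read the `KernelPageSize` entry that Linux reports in `/proc/self/smaps` and multiply it by the 1024 pages mentioned above. A minimal sketch, assuming Linux; `kernel_page_kib` and `reserved_mib` are illustrative names, and the 1024-page figure is taken from this answer rather than verified against the rustc source.

    ```rust
    use std::fs;

    /// Page size (KiB) of the first mapping listed in /proc/self/smaps.
    /// Returns None when the file or the entry is unavailable (e.g. not on Linux).
    fn kernel_page_kib() -> Option<u64> {
        let smaps = fs::read_to_string("/proc/self/smaps").ok()?;
        smaps.lines().find_map(|line| {
            let rest = line.strip_prefix("KernelPageSize:")?;
            rest.split_whitespace().next()?.parse().ok()
        })
    }

    /// Size in MiB of `pages` pages of `page_kib` KiB each.
    fn reserved_mib(pages: u64, page_kib: u64) -> u64 {
        pages * page_kib / 1024
    }

    fn main() {
        match kernel_page_kib() {
            Some(kib) => println!(
                "page size: {kib} KiB, 1024 pages = {} MiB",
                reserved_mib(1024, kib)
            ),
            None => println!("/proc/self/smaps is not available on this platform"),
        }
    }
    ```

    Running `getconf PAGESIZE` from a shell reports the same page size in bytes, which is a quick way to compare the two machines.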