Search code examples
memoryvalgrindforkmassiftop-command

Tracking memory usage of piped commands with valgrind


I have a couple processes running a tool I've written that are joined by pipes, and I would like to measure their collected memory usage with valgrind. So far, I have tried something like:

$ valgrind tool=massif trace-children=yes --peak-inaccuracy=0.5 --pages-as-heap=yes --massif-out-file=myProcesses".%p" myProcesses.script

Where myProcesses.script runs the equivalent of my tool foo twice, e.g.:

foo | foo > /dev/null

Valgrind doesn't seem to capture the collected memory usage of this the way I expect. If I use top to track this, I get (for sake of argument) 10% memory usage on the first foo, and then another 10% collects on the second foo before the myProcesses.script completes. This is the sort of thing I want to measure: the usage of both processes. Valgrind instead returns the following error:

Massif: ms_main.c:1891 (ms_new_mem_brk): Assertion 'VG_IS_PAGE_ALIGNED(len)' failed.

Is there a way to collect memory usage data for commands I'm using in a piped fashion (using valgrind)? Or a similar tool that I can use to accurately automate these measurements?

The numbers that top returns while polling seem hand-wavy, to me, and I am seeking accurate and repeatable measurements. If you have suggestions for alternative tools, I would welcome those, as well.

EDIT - Fixed typo with valgrind option.

EDIT 2 - For some reason, it appears that the option --pages-as-heap is giving us troubles with the binaries we're testing. Your examples run fine. A new page is created every time we enter a non-inlined function (stack overflows - heh). We wanted to count those, but they're relatively minor in the scale of memory usage we're testing. (Perhaps there aren't function calls in ls or less?) Removing --pages-as-heap helped get testing working again. Thanks to MrGomez for the great help.


Solution

  • With the correct valgrind version given in the errata, this seems to just work for me in Valgrind 3.6.1. My invocation:

    <me>@harley:/tmp/test$ /usr/local/bin/valgrind --tool=massif \
      --trace-children=yes --peak-inaccuracy=0.5 --pages-as-heap=yes \
      --massif-out-file=myProcesses".%p" ./testscript.sh
    ==21067== Massif, a heap profiler
    ==21067== Copyright (C) 2003-2010, and GNU GPL'd, by Nicholas Nethercote
    ==21067== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
    ==21067== Command: ./testscript.sh
    ==21067==
    ==21068== Massif, a heap profiler
    ==21068== Copyright (C) 2003-2010, and GNU GPL'd, by Nicholas Nethercote
    ==21068== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
    ==21068== Command: /bin/ls
    ==21068==
    ==21070== Massif, a heap profiler
    ==21070== Copyright (C) 2003-2010, and GNU GPL'd, by Nicholas Nethercote
    ==21070== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
    ==21069== Massif, a heap profiler
    ==21069== Copyright (C) 2003-2010, and GNU GPL'd, by Nicholas Nethercote
    ==21069== Using Valgrind-3.6.1 and LibVEX; rerun with -h for copyright info
    ==21069== Command: /bin/sleep 5
    ==21069==
    ==21070== Command: /usr/bin/less
    ==21070==
    ==21068==
    (END) ==21069==
    ==21070==
    ==21067==
    

    The contents of my test script, testscript.sh:

    ls | sleep 5 | less
    

    Sparse contents from one of the files generated by --massif-out-file=myProcesses".%p" (myProcesses.21055):

    desc: --peak-inaccuracy=0.5 --pages-as-heap=yes --massif-out-file=myProcesses.%p
    cmd: ./testscript.sh
    time_unit: i
    #-----------
    snapshot=0
    #-----------
    time=0
    mem_heap_B=110592
    mem_heap_extra_B=0
    mem_stacks_B=0
    heap_tree=empty
    #-----------
    snapshot=1
    #-----------
    time=0
    mem_heap_B=118784
    mem_heap_extra_B=0
    mem_stacks_B=0
    heap_tree=empty
    ...
    #-----------
    snapshot=18
    #-----------
    time=108269
    mem_heap_B=1708032
    mem_heap_extra_B=0
    mem_stacks_B=0
    heap_tree=peak
    n2: 1708032 (page allocation syscalls) mmap/mremap/brk, --alloc-fns, etc.
     n3: 1474560 0x4015E42: mmap (mmap.S:62)
      n1: 1425408 0x4005CAC: _dl_map_object_from_fd (dl-load.c:1209)
       n2: 1425408 0x4007109: _dl_map_object (dl-load.c:2250)
        n1: 1413120 0x400CEEA: openaux (dl-deps.c:65)
         n1: 1413120 0x400D834: _dl_catch_error (dl-error.c:178)
          n1: 1413120 0x400C1E0: _dl_map_object_deps (dl-deps.c:247)
           n1: 1413120 0x4002B59: dl_main (rtld.c:1780)
            n1: 1413120 0x40140C5: _dl_sysdep_start (dl-sysdep.c:243)
             n1: 1413120 0x4000C6B: _dl_start (rtld.c:333)
              n0: 1413120 0x4000855: ??? (in /lib/ld-2.11.1.so)
        n0: 12288 in 1 place, below massif's threshold (01.00%)
      n0: 28672 in 3 places, all below massif's threshold (01.00%)
      n1: 20480 0x4005E0C: _dl_map_object_from_fd (dl-load.c:1260)
       n1: 20480 0x4007109: _dl_map_object (dl-load.c:2250)
        n0: 20480 in 2 places, all below massif's threshold (01.00%)
     n0: 233472 0xFFFFFFFF: ???
    #-----------
    snapshot=19
    #-----------
    time=108269
    mem_heap_B=1703936
    mem_heap_extra_B=0
    mem_stacks_B=0
    heap_tree=empty
    #-----------
    snapshot=20
    #-----------
    time=200236
    mem_heap_B=1839104
    mem_heap_extra_B=0
    mem_stacks_B=0
    heap_tree=empty
    

    Massif continues to complain about heap allocations in the remainder of my files. Note this is very similar to your error.

    I theorize that your version of valgrind was built in debug mode, causing the asserts to fire. A rebuild from source (I used this with the defaults hanging off ./configure) will fix the issue.

    Either way, this seems to be expected with Massif.