
Monitoring memory consumption over more than 1 hour


I am trying to monitor the memory consumption of a process over a long period with Valgrind's massif tool. The process stays active and performs routine operations at fixed time intervals, and I would like to see the memory consumption of the whole process.

I launch the process with:

valgrind --tool=massif --trace-children=yes <program name> <arguments>

My program creates a daemon.

Massif creates an output file for the main process, which exits almost immediately, while the daemon stays alive. When I kill the daemon, massif writes a second file named with the daemon's pid. However, I get this second file only if I let the process run for no more than 15 minutes or so. If I let it run longer, no file is generated. Valgrind shows no errors.

I suspect that valgrind cannot handle such a large amount of information. Is that correct? Any suggestions on how I could achieve my objective some other way?

I am running the latest version of valgrind: 3.12.0


Solution

  • If valgrind encounters a problem (such as an out-of-memory condition), it is supposed to produce an error message. One possible reason for valgrind dying without any error message is that it was killed with SIGKILL (kill -9), for example by the kernel's OOM killer.

    To check this, you could run vgdb in a loop in another terminal, doing something like:

      while true
      do
         vgdb .... valgrind monitor command ...
         sleep 60
      done
    

    As the monitor command, you can either use a command that reports the internal state of valgrind's memory:

         vgdb v.info stats
    

    or, alternatively when running under massif, you can produce a memory snapshot every 60 seconds or so by using

         vgdb detailed_snapshot filenameXXX
    

    (you need to change the filename for each snapshot)

    See http://www.valgrind.org/docs/manual/manual-core-adv.html#manual-core-adv.gdbserver for more info about vgdb and monitor commands.
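
The snapshot loop described above could be sketched roughly as follows. This is only an illustration under some assumptions: the function names `snapshot_name` and `take_snapshots`, the `VGDB` override variable, and the numbered file-name scheme are all hypothetical; only `vgdb --pid=<pid>` and the massif monitor command `detailed_snapshot <filename>` come from Valgrind itself. It also assumes vgdb returns a non-zero exit status once it can no longer reach the process, which is what ends the loop.

```shell
#!/bin/sh
# Hypothetical helper: build a unique snapshot file name from a prefix and a counter.
snapshot_name() {
    printf '%s.%04d\n' "$1" "$2"
}

# Hypothetical helper: ask the massif-instrumented process PID for a detailed
# snapshot every INTERVAL seconds, writing each one to PREFIX.NNNN.
# The loop ends as soon as the vgdb command fails (assumed to mean that the
# Valgrind process is gone or unreachable).
take_snapshots() {
    pid=$1
    interval=$2
    prefix=$3
    i=1
    # VGDB can be set to another command name; it defaults to plain "vgdb".
    while "${VGDB:-vgdb}" --pid="$pid" detailed_snapshot "$(snapshot_name "$prefix" "$i")"
    do
        i=$((i + 1))
        sleep "$interval"
    done
    echo "vgdb failed at attempt $i; check when the previous snapshot was written" >&2
}
```

For example, `take_snapshots 12345 60 /tmp/massif.snap` would request one snapshot per minute from the Valgrind process with pid 12345 (a made-up pid). The snapshot file is presumably written by the Valgrind process rather than by vgdb, so an absolute path is likely the safer choice for a daemon that changes its working directory. The time of the last successful snapshot then narrows down when Valgrind disappeared.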