Tags: profiling, valgrind, callgrind

Is callgrind profiling influenced by other processes?


I'd like to profile my application using callgrind. Since that takes a very long time, in the meantime I carry on with web browsing, compiling, and other intensive tasks on the same machine.

Am I biasing the profiling results? I'm expecting that, since valgrind uses a simulated CPU, other external processes should not interfere with valgrind execution. Am I right?


Solution

  • By default, Callgrind does not record anything related to time, so you can expect all collected metrics to (mostly) be independent of other processes on the machine. As the Callgrind manual states,

    By default, the collected data consists of the number of instructions executed, their relationship to source lines, the caller/callee relationship between functions, and the numbers of such calls.

    As such, the metrics Callgrind reports should depend only on which instructions the program executes on the (simulated) CPU, not on how much time those instructions take. Indeed, Callgrind's output can often be somewhat misleading, because the simulated CPU may behave differently from the real one (particularly when it comes to branch prediction). The Callgrind paper presented at ICCS 2004 is very clear about this as well:

    We note that the simulation is not able to predict consumed wall clock time, as this would need a detailed simulation of the microarchitecture.

    In any case, the simulated CPU is unaffected by what the real CPU is doing. The reason is straightforward. As you said, your program is never executed directly on the real CPU. Instead, at runtime, Valgrind dynamically translates your program: it disassembles the binary into "UCode" for a simulated machine, adds analysis code (called instrumentation), and then generates binary code that executes the simulation. The addition of analysis code is what makes instruction counting (in Callgrind), memory checking (in Memcheck), and all other Valgrind tools possible.

    Therein lies the twist, however. Naturally, there are limits to how isolated a program can be in such a dynamic simulation. First, your program might interact with other programs. While the time spent doing so is irrelevant (it is not accounted for), the results of inter-process communication can certainly change depending on what else is going on in the system. Second, most system calls need to be run untranslated, and their return codes can change as well. Either effect can lead your program down different execution paths and, thus, to slightly different metrics being collected. (As an aside, Callgrind offers an option, --collect-systime, to record the wall clock time spent during syscalls, which will always be affected by what else goes on in the system.) More details about these restrictions can be found in the PhD dissertation of Nicholas Nethercote ("Dynamic Binary Analysis and Instrumentation").
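
The two points above (counts do not depend on elapsed time, but they do depend on what the outside world returns) can be illustrated with a toy model. This is only a sketch and not how Valgrind works internally: the function name poll_until_ready, the list of fake "syscall results", and the per-path instruction costs (1, 5, 20) are all made up for illustration.

```python
import time


def poll_until_ready(results, delay_s=0.0):
    """Toy model of an instrumented program (hypothetical, not Valgrind's API).

    results  -- stand-ins for successive syscall return values:
                False means "try again", True means "ready".
    delay_s  -- stand-in for wall-clock time lost to other processes.
    Returns the number of simulated instructions that were counted.
    """
    instructions = 0
    for ready in results:
        time.sleep(delay_s)    # time passes, but the counter never sees it
        instructions += 1      # the call itself (made-up cost)
        if ready:
            instructions += 5  # success path (made-up cost)
            break
        instructions += 20     # retry path after a failed call (made-up cost)
    return instructions


quiet = poll_until_ready([True])
quiet_but_slow = poll_until_ready([True], delay_s=0.01)
busy = poll_until_ready([False, False, True])

print(quiet, quiet_but_slow, busy)  # prints "6 6 48"
```

Slowing the run down leaves the count at 6, while a different sequence of return values changes the executed path and raises the count to 48. In this model, that is the only way a busy machine can leak into the collected metrics.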