linux, profiling, perf

How to collect some readable stack traces with perf?


I want to profile a C++ program on Linux using the random sampling technique described in this answer:

However, if you're in a hurry and you can manually interrupt your program under the debugger while it's being subjectively slow, there's a simple way to find performance problems.

The problem is that I can't use gdb, because I want to profile in production under heavy load, and a debugger is too intrusive and slows the program down considerably. However, I can use perf record and perf report to find bottlenecks without affecting program performance. Is there a way to collect a number of readable (gdb-like) stack traces with perf instead of gdb?


Solution

  • perf does offer call stack recording, using three different techniques:

    • By default it uses the frame pointer (fp). This is generally supported and performs well, but it doesn't work with certain optimizations. Compile your applications with flags such as -fno-omit-frame-pointer to make sure it works well.
    • dwarf copies a chunk of the stack with each sample and unwinds it during post-processing. That has a significant performance penalty.
    • Modern systems can use the hardware-supported last branch record (lbr).

    The stack is accessible in perf analysis tools such as perf report or perf script.

    For more details check out man perf-record.
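
As a rough sketch of how these pieces fit together, the script below attaches perf to an already running process for a few seconds and then prints every sample with its call chain via perf script, which is probably the closest thing to a series of gdb backtraces. The function name perf_record_cmd, the environment variables TARGET_PID, METHOD and SECS, and the 99 Hz sampling rate are assumptions made for illustration; they are not part of the answer above.

```shell
#!/bin/sh
# Sketch only: sample a running process and dump readable stack traces.
# For the fp method the target should be built with frame pointers, e.g.
#   g++ -O2 -g -fno-omit-frame-pointer ...

# Print the perf record command line for one of the three unwinding
# methods (fp, dwarf, lbr). Hypothetical helper, not part of perf.
perf_record_cmd() {
    method=$1; pid=$2; secs=$3
    case "$method" in
        fp|dwarf|lbr) ;;
        *) echo "unknown call-graph method: $method" >&2; return 1 ;;
    esac
    # -F 99      : take roughly 99 samples per second
    # -p PID     : attach to an existing process
    # -- sleep N : stop recording after N seconds
    echo "perf record -F 99 --call-graph $method -p $pid -o perf.data -- sleep $secs"
}

# Record and print only when a target PID was supplied.
if [ -n "${TARGET_PID:-}" ]; then
    cmd=$(perf_record_cmd "${METHOD:-fp}" "$TARGET_PID" "${SECS:-10}") || exit 1
    $cmd                      # unquoted on purpose: split into command words
    perf script -i perf.data  # one block per sample, one stack frame per line
fi
```

Saved under a hypothetical name such as stacks.sh, this could be run as TARGET_PID=1234 METHOD=dwarf sh stacks.sh. Attaching to another process may require root or a relaxed perf_event_paranoid setting, and lbr only works on CPUs that support it. For an aggregated view of the same data, perf report -i perf.data reads the same file.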