How does perf determine the load addresses for each loaded image (e.g., shared libraries) during post-processing? For example, perf report uses this information to make each symbol address relative to the beginning of each loaded image. This is shown in the image below (unwind: _int_malloc...):
Is it stored somewhere in the ELF binary or in the profiling output (i.e., perf.data)?
The load addresses of shared libraries are stored in the perf.data file written by the perf record command. You can use the perf script -D command to dump the contents of perf.data in a partially decoded format. When your program is started, the dynamic loader ld-linux*.so.2 (or dlopen, when it is used) searches for each library and loads its segments with the mmap syscall. The kernel records these mmap events, and they appear in perf.data as records of type PERF_RECORD_MMAP or PERF_RECORD_MMAP2. perf report (and perf script) then use them to reconstruct the memory offsets needed to decode symbol names.
$ perf record echo 1
$ perf script -D|grep MMAP -c
7
$ perf script -D|less
PERF_RECORD_MMAP2 ... r-xp /bin/echo
...
PERF_RECORD_MMAP2 ... r-xp /lib/x86_64-linux-gnu/libc-2.27.so
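To work with these records outside of perf, the text dump can be parsed with a short script. The following is only a sketch: the function name parse_mmap2 and the sample line are made up for illustration, and the regex assumes the "pid/tid: [start(len) @ pgoff maj:min ino gen]: prot path" layout, which may differ between perf versions.

```python
import re

# Assumed text layout of an MMAP2 record in `perf script -D` output:
#   ... PERF_RECORD_MMAP2 pid/tid: [start(len) @ pgoff maj:min ino gen]: prot path
# Adjust the pattern if your perf version prints the record differently.
MMAP2_RE = re.compile(
    r"PERF_RECORD_MMAP2\s+(?P<pid>-?\d+)/(?P<tid>-?\d+):\s+"
    r"\[(?P<start>0x[0-9a-f]+)\((?P<length>0x[0-9a-f]+)\)\s+@\s+"
    r"(?P<pgoff>(?:0x)?[0-9a-f]+)[^\]]*\]:\s+"
    r"(?P<prot>\S+)\s+(?P<path>.+)$"
)


def parse_mmap2(line):
    """Return the fields of one MMAP2 dump line as a dict, or None if no match."""
    m = MMAP2_RE.search(line)
    if m is None:
        return None
    return {
        "pid": int(m["pid"]),
        "tid": int(m["tid"]),
        "start": int(m["start"], 16),    # address the segment was mapped at
        "length": int(m["length"], 16),  # size of the mapping in bytes
        "pgoff": int(m["pgoff"], 16),    # offset of the mapping within the file
        "prot": m["prot"],
        "path": m["path"].strip(),
    }


# Hypothetical dump line, not taken from a real perf.data file.
SAMPLE_LINE = (
    "0 0x4a0 [0x70]: PERF_RECORD_MMAP2 1234/1234: "
    "[0x7f3a5c000000(0x1e7000) @ 0 08:01 131234 0]: "
    "r-xp /lib/x86_64-linux-gnu/libc-2.27.so"
)

if __name__ == "__main__":
    print(parse_mmap2(SAMPLE_LINE))
```

In practice the lines would come from something like perf script -D piped into the script, with lines that do not match simply skipped.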
The basic ideas of perf are described in the https://github.com/torvalds/linux/blob/master/tools/perf/design.txt file. Profiling starts with the perf_event_open syscall, which takes a perf_event_attr *attr argument. The man page describes the mmap-related fields of attr:
The perf_event_attr structure provides detailed configuration
information for the event being created.
mmap : 1, /* include mmap data */
mmap_data : 1, /* non-exec mmap data */
mmap2 : 1, /* include mmap with inode data */
The Linux kernel's perf_events subsystem (kernel/events) records the requested events for the profiled processes and exports the data to the profiler through the event fd and mmap. perf record usually dumps this data from the kernel into the perf.data file without heavy processing (see the "Woken up 1 times to write data" messages in your perf record output). In the kernel, mmap events are written by perf_event_mmap_output, which is called from perf_event_mmap_event, which in turn is called from perf_event_mmap. The mmap syscall implementation in mm/mmap.c has some unconditional calls to perf_event_mmap.
perf's design.txt mentions munmap, but the current implementation has no munmap field or event; event code 2 was reused for PERF_RECORD_LOST. There have been suggestions that munmap events could be helpful: https://www.spinics.net/lists/netdev/msg524414.html, with links to https://lkml.org/lkml/2016/12/10/1 and https://lkml.org/lkml/2017/1/27/452
The perf tool is part of the Linux kernel sources and can be browsed online on the LXR/elixir website: https://elixir.bootlin.com/linux/v5.4/source/tools/perf/
The processing code for mmap/mmap2 events is in tools/perf/util/machine.c, in machine__process_mmap_event and machine__process_mmap2_event. The logged mmap arguments (returned address, offset, and file name) are recorded with the help of map__new and thread__insert_map for the process (pid/tid) and are used later to convert a sample event's address into a symbol name.
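The lookup this enables can be sketched in a few lines. This is not perf's actual code: the Mapping and AddressSpace names are invented for illustration, overlapping mappings are not handled, and the final step of matching the file offset against the image's symbol table is left out. It only shows the arithmetic of turning a sampled address into an image-relative offset, namely addr - start + pgoff.

```python
import bisect
from dataclasses import dataclass


@dataclass
class Mapping:
    """One recorded mmap event (hypothetical structure, for illustration)."""
    start: int   # address returned by mmap
    length: int  # size of the mapping in bytes
    pgoff: int   # offset of the mapping within the file
    path: str    # file name of the mapped image


class AddressSpace:
    """Mappings of one process, kept sorted by start address."""

    def __init__(self):
        self._starts = []
        self._maps = []

    def insert(self, m):
        # Keep both lists sorted so that lookups can use binary search.
        i = bisect.bisect_left(self._starts, m.start)
        self._starts.insert(i, m.start)
        self._maps.insert(i, m)

    def resolve(self, addr):
        """Return (path, offset within the file) for addr, or None if unmapped."""
        i = bisect.bisect_right(self._starts, addr) - 1
        if i < 0:
            return None
        m = self._maps[i]
        if addr >= m.start + m.length:
            return None
        return m.path, addr - m.start + m.pgoff


if __name__ == "__main__":
    # Hypothetical values, not taken from a real profile.
    space = AddressSpace()
    space.insert(Mapping(0x7F3A5C000000, 0x1E7000, 0x0, "/lib/x86_64-linux-gnu/libc-2.27.so"))
    space.insert(Mapping(0x55D0C0E00000, 0x8000, 0x0, "/bin/echo"))
    path, offset = space.resolve(0x7F3A5C0979A0)
    print(path, hex(offset))
```

A sorted list with binary search is used here only because it is short; any interval lookup structure would do the same job.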
PS: Your perf.data is 300+ MB in size, which is huge, and processing it can be slow. For long-running programs you may want to lower the event sampling frequency with the -F freq option of perf record (for example, perf record -F40), or set a sample period with the -c option.