I am using perf to profile a node server. The issue is that after I stop the server, perf records a buffer overflow and doesn't write any of the data.
perf record -e cycles:u -g -- npm run start
[ perf record: Woken up 96 times to write data ]
*** buffer overflow detected ***: terminated
I have tried using valgrind's memcheck to look for buffer overflows, but it doesn't report anything, which means that it may be a static buffer overflow. Is there any way for perf to report where the buffer overflow is occuring?
Result of running perf under gdb with breakpoint at __stack_chk_fail
Thread 1 "perf" received signal SIGABRT, Aborted.
__pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0)
at ./nptl/pthread_kill.c:44
44 ./nptl/pthread_kill.c: No such file or directory.
(gdb) bt
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0)
at ./nptl/pthread_kill.c:44
#1 0x00007ffff72d4d2f in __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2 0x00007ffff7285ef2 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3 0x00007ffff7270472 in __GI_abort () at ./stdlib/abort.c:79
#4 0x00007ffff72c92d0 in __libc_message (action=action@entry=do_abort,
fmt=fmt@entry=0x7ffff73e3210 "*** %s ***: terminated\n") at ../sysdeps/posix/libc_fatal.c:155
#5 0x00007ffff7361e82 in __GI___fortify_fail (msg=msg@entry=0x7ffff73e31b6 "buffer overflow detected")
at ./debug/fortify_fail.c:26
#6 0x00007ffff7360990 in __GI___chk_fail () at ./debug/chk_fail.c:28
#7 0x00005555557e7ddd in memcpy (__len=40, __src=0x555556a28b38, __dest=0x7fffffff843c)
at /usr/include/x86_64-linux-gnu/bits/string_fortified.h:29
#8 write_buildid (fd=0x7fffffff8590, misc=<optimized out>, pid=-1, bid=0x555556a28b38,
name_len=<optimized out>, name=0x555556a28c0c "/opt/pylon/lib/libpylonbase-6.1.1.so") at util/build-id.c:312
#9 machine__write_buildid_table (machine=machine@entry=0x555555d9bef0, fd=fd@entry=0x7fffffff8590)
at util/build-id.c:361
#10 0x00005555557e865e in perf_session__write_buildid_table (session=session@entry=0x555555d9bd00,
fd=fd@entry=0x7fffffff8590) at util/build-id.c:374
#11 0x000055555581c4b9 in write_build_id (ff=ff@entry=0x7fffffff8590, evlist=evlist@entry=0x555555d96d60)
at util/header.c:320
#12 0x0000555555824fa3 in do_write_feat (evlist=0x555555d96d60, p=<synthetic pointer>, type=2, ff=0x7fffffff8590)
at util/header.c:3224
#13 perf_header__adds_write (fd=3, evlist=0x555555d96d60, header=<optimized out>) at util/header.c:3269
#14 perf_session__write_header (session=<optimized out>, evlist=0x555555d96d60, fd=3, at_exit=at_exit@entry=true)
at util/header.c:3353
#15 0x0000555555760777 in record__finish_output (rec=0x555555b9bb40 <record>) at builtin-record.c:1236
#16 0x0000555555763560 in __cmd_record (rec=0x555555b9bb40 <record>, argv=<optimized out>, argc=<optimized out>)
at builtin-record.c:2026
#17 cmd_record (argc=<optimized out>, argv=<optimized out>) at builtin-record.c:2835
#18 0x00005555557dc8a3 in run_builtin (p=p@entry=0x555555ba6cb8 <commands+216>, argc=argc@entry=8,
argv=argv@entry=0x7fffffffdb90) at perf.c:312
#19 0x000055555574af48 in handle_internal_command (argv=0x7fffffffdb90, argc=8) at perf.c:364
#20 run_argv (argv=<synthetic pointer>, argcp=<synthetic pointer>) at perf.c:408
#21 main (argc=8, argv=0x7fffffffdb90) at perf.c:538
One thing that is interesting is I only get the buffer overflow error when I include a specific C++ add-on that loads a couple of additional shared libraries. If I run the server without that C++ add-on then perf does not report a buffer overflow.
This is a bug in perf. Perf only records the first 20 characters of a build id, with a stack allocated array of 20 characters. One of the libraries I was loading had a buildid that was longer than 20 characters. A bug in the kernel meant that the size would be recorded as longer than 20 characters, and when we tried copying the build id we smash the stack.