I am seeing some strange behaviour in glibc. The code had a bug where it would pass a random pointer to fclose(). I would have expected it to crash at that point, but instead it hangs in pthread_once(), with the backtrace below. The program does not use any threading.
#0 0x000000318180ca38 in pthread_once () from /lib64/libpthread.so.0
#1 0x0000003181109d1c in backtrace () from /lib64/libc.so.6
#2 0x0000003181075d34 in __libc_message () from /lib64/libc.so.6
#3 0x000000318107c6fc in malloc_consolidate () from /lib64/libc.so.6
#4 0x000000318107d719 in _int_malloc () from /lib64/libc.so.6
#5 0x0000003181080a4a in calloc () from /lib64/libc.so.6
#6 0x0000003180c0b0df in _dl_new_object () from /lib64/ld-linux-x86-64.so.2
#7 0x0000003180c061ac in _dl_map_object_from_fd () from /lib64/ld-linux-x86-64.so.2
#8 0x0000003180c08563 in _dl_map_object () from /lib64/ld-linux-x86-64.so.2
#9 0x0000003180c13861 in dl_open_worker () from /lib64/ld-linux-x86-64.so.2
#10 0x0000003180c0f304 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2
#11 0x0000003180c131eb in _dl_open () from /lib64/ld-linux-x86-64.so.2
#12 0x00000031811305d2 in do_dlopen () from /lib64/libc.so.6
#13 0x0000003180c0f304 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2
#14 0x0000003181130692 in __libc_dlopen_mode () from /lib64/libc.so.6
#15 0x0000003181109c05 in init () from /lib64/libc.so.6
#16 0x000000318180ca40 in pthread_once () from /lib64/libpthread.so.0
#17 0x0000003181109d1c in backtrace () from /lib64/libc.so.6
#18 0x0000003181075d34 in __libc_message () from /lib64/libc.so.6
#19 0x000000318107d0b8 in _int_free () from /lib64/libc.so.6
#20 0x000000318106ba6d in fclose@@GLIBC_2.2.5 () from /lib64/libc.so.6
This is on Fedora 19 with glibc-2.17-20.fc19.x86_64, and the program is started from systemd with StandardError=null, so there is nowhere for __libc_message() to write its error message.
I've fixed the code, but is that hang a glibc bug or what?
I have been looking into similar backtraces recently, and I believe this is a glibc bug that has been around for at least 7 years, since Ubuntu 8.04.
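Basically this happens after memory corruption has occurred and, unfortunately, the error-reporting path allocates memory itself: __libc_message() calls backtrace(), whose one-time init() goes through dlopen and calloc() (frames #15 down to #5 in your trace). Since the heap is already corrupted, that allocation detects the corruption too and tries to report it with another backtrace(). That second backtrace() enters pthread_once() while the first init() is still running on the same thread, so it waits forever. The result is a deadlock in pthread_once(), even in a single-threaded program.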
=EDITED= I found a bug report for this issue, but it appears to be fixed only on the master branch: https://sourceware.org/bugzilla/show_bug.cgi?id=16159