Question for the linker gurus out there. I have been working with mex files in Matlab and have been getting an awful lot of unexplained crashes, so I want to dig a bit deeper.
Can you explain to me how static data is allocated (deallocated) in a process' virtual memory space when a dynamic module is being loaded (unloaded)?
I assume this takes place in the _init() and _fini() functions. However, does the BSS segment get assigned a chunk of memory in the heap space, along with other dynamic memory allocations?
What about global data in a dynamic module? Would there be a possibility of symbol name clashes with the primary executable?
Thanks for shedding light on these issues. If I have to choose a platform I would like to hear from the ELF experts since I do most of my development on Linux.
Can you explain to me how static data is allocated (deallocated) in a process' virtual memory space when a dynamic module is being loaded (unloaded)?
That part is easy: every ELF file has PT_LOAD segments, which you can see in the output from readelf -Wl foo.so. When loading the shared object, each of these segments is mmaped into the address space, and that serves as "allocation" for any static data in that shared object.
When foo.so is unloaded, the data (and code) is disposed of via the munmap system call.
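To see those mappings from inside a running process, here is a minimal sketch that assumes glibc on Linux and uses dl_iterate_phdr() from <link.h>. The names show_object and count_load_segments are made up for this illustration. It walks every loaded object and prints the address range and permissions of each PT_LOAD segment.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <link.h>
#include <stdio.h>

/* Called once per loaded object (main program, shared libraries, vDSO). */
static int show_object(struct dl_phdr_info *info, size_t size, void *data)
{
    int *total = data;
    /* glibc reports an empty name for the main program. */
    const char *name = (info->dlpi_name && info->dlpi_name[0])
                           ? info->dlpi_name
                           : "(unnamed, usually the main program)";
    (void)size;

    for (int i = 0; i < info->dlpi_phnum; i++) {
        const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
        if (ph->p_type != PT_LOAD)
            continue;
        /* The ELF format does not allow the file image to exceed the memory image. */
        assert(ph->p_memsz >= ph->p_filesz);

        /* Run-time address = load base of the object + link-time p_vaddr. */
        unsigned long start = (unsigned long)(info->dlpi_addr + ph->p_vaddr);
        unsigned long end = start + (unsigned long)ph->p_memsz;
        printf("%s: %#lx-%#lx %c%c%c\n", name, start, end,
               (ph->p_flags & PF_R) ? 'r' : '-',
               (ph->p_flags & PF_W) ? 'w' : '-',
               (ph->p_flags & PF_X) ? 'x' : '-');
        (*total)++;
    }
    return 0; /* 0 means: keep iterating over the remaining objects */
}

/* Prints every PT_LOAD segment in the process and returns how many were seen. */
int count_load_segments(void)
{
    int total = 0;
    dl_iterate_phdr(show_object, &total);
    return total;
}
```

Calling count_load_segments() before and after a dlopen() of the module should show the extra ranges appear; comparing against /proc/self/maps is another way to cross-check.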
I assume this takes place in the _init() and _fini() functions
That assumption is not correct. The _init and _fini functions are about dynamic initialization (e.g. global variables of class type in C++ with a non-trivial constructor/destructor). By the time _init is called, the memory for all globals has already been "reserved" via mmap.
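A small sketch of that ordering in plain C, assuming GCC or Clang: the constructor attribute is a compiler extension whose functions the dynamic loader runs in the same initialization phase as _init. The names on_load, seen_initialized, seen_zeroed and ctor_saw_ready_data are invented for the example. The constructor only observes the globals; it does not create them.

```c
#include <assert.h>
#include <stdio.h>

int initialized = 42;   /* initialized data, typically placed in .data */
int zeroed[1024];       /* zero-initialized data, typically placed in .bss */

/* What the constructor observed when it ran. */
static int seen_initialized = -1;
static int seen_zeroed = -1;
static int ctor_ran;

/* Runs at load time, before main() (or before dlopen() returns for a module). */
__attribute__((constructor))
static void on_load(void)
{
    seen_initialized = initialized;
    seen_zeroed = zeroed[1023];
    ctor_ran = 1;
}

/* Returns 1 if the constructor ran and found the static data already in place. */
int ctor_saw_ready_data(void)
{
    int ok = ctor_ran && seen_initialized == 42 && seen_zeroed == 0;
    printf("constructor ran: %d, saw initialized=%d, zeroed[1023]=%d\n",
           ctor_ran, seen_initialized, seen_zeroed);
    return ok;
}

/* Aborts if the storage was not ready when the constructor ran. */
void check_ctor_saw_ready_data(void)
{
    assert(ctor_saw_ready_data());
}
```

The same pattern applies inside a shared object: by the time its constructors run, its .data and .bss are already mapped and hold their initial values.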
However does the BSS segment
The .bss section is included in the same PT_LOAD segment as the other initialized (writable) data. This is why there are separate p_filesz and p_memsz fields in the ElfXX_Phdr: p_filesz "covers" the initialized data, and the (larger) p_memsz causes the mmap to "allocate" space for both initialized and .bss data.
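The p_filesz / p_memsz split can be checked at run time. The following is a sketch under the same assumptions as before (glibc, Linux, dl_iterate_phdr()); big_zeroed, find_segment and bss_is_beyond_file_image are hypothetical names. It finds the PT_LOAD segment that contains a zero-initialized array and tests whether the array lies past the part of the segment that is backed by the file.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <link.h>
#include <stdint.h>

char big_zeroed[1 << 20];   /* 1 MiB of zero-initialized data, expected in .bss */

struct query {
    uintptr_t addr;     /* address to look up */
    int found;          /* set if some PT_LOAD segment contains addr */
    int beyond_file;    /* set if addr is past the file-backed part */
};

static int find_segment(struct dl_phdr_info *info, size_t size, void *data)
{
    struct query *q = data;
    (void)size;

    for (int i = 0; i < info->dlpi_phnum; i++) {
        const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
        if (ph->p_type != PT_LOAD)
            continue;

        uintptr_t start = (uintptr_t)(info->dlpi_addr + ph->p_vaddr);
        if (q->addr < start || q->addr >= start + ph->p_memsz)
            continue;

        assert(ph->p_memsz >= ph->p_filesz);
        q->found = 1;
        /* Bytes in [start + p_filesz, start + p_memsz) have no file contents. */
        q->beyond_file = (q->addr >= start + ph->p_filesz);
        return 1; /* non-zero stops the iteration */
    }
    return 0;
}

/* Returns 1 if the last byte of big_zeroed lies in the memsz-only tail. */
int bss_is_beyond_file_image(void)
{
    struct query q = { (uintptr_t)&big_zeroed[sizeof big_zeroed - 1], 0, 0 };
    dl_iterate_phdr(find_segment, &q);
    return q.found && q.beyond_file;
}
```

With typical linker layouts this should return 1, which matches what readelf -Wl shows: the writable PT_LOAD segment has a MemSiz about 1 MiB larger than its FileSiz.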
What about global data in a dynamic module?
What about it? I covered initialized data above.
Would there be possibility of symbol name clashes with the primary executable?
Certainly. You can define int foo = 42; in a.out, and int foo = 24; in foo.so. The usual rule is that if foo is visible in the dynamic symbol table of a.out, then that foo will be used regardless of where it is referenced from.
Complications arise when a.out does not export foo (if it is not linked with -rdynamic and does not link against foo.so), or when foo.so is linked with -Bsymbolic.