I'm using C++ dlopen() to link a shared library named as lib*.so (in directory A) in my main program (in directory B).
I experimented on some simple function loading. Every thing works very well. However, it gave me a headache when I was trying to load class and factory functions that return a pointer to the class object. (I'm using the terms from the tutorial below)
The methodology I used was based on the examples in chapter 3.3 of this tutorial https://www.tldp.org/HOWTO/C++-dlopen/thesolution.html#externC.
There is a bit of polymorphism here ... lib*.so contains a child class that inherits a parent abstract class from the main program directory (directory B). When dlopen() tries to load lib*.so in the main program, it failed due to "undefined symbol".
I used nm command to examine the symbol tables in lib*.so and main program binary. The symbols in these binaries are:
lib*.so : U _ZTI7ParentBox
main program binary: V _ZTI7ParentBox
ParentBox is the name of the parent class inherited by ChildBox in lib*.so. Note that parent class header file is in another project in directory B.
Although there is name mangling the symbol names are exactly the same. I'm just wondering why the dynamic linker cannot link them? and giving me undefeind symbol error for dlopen()?
Am I missing the understanding of some key concepts here?
P.S. more strangely, it was able to resolve the symbols for member functions between the child class (U type symbol) in lib*.so (T type symbol) and parent class. Why is it able to do this but not able to resolve the undefined symbol for parent class name?
(I've been searching around for a long time and tried -rdynamic, -ldl stuff though I'm not fully understood what they are, but nothing worked)
Update 04 April 2019: This is the g++ command line I used to make the main program binary.
g++ -fvisibility=hidden -pthread -static-libgcc -static-libstdc++ \
-m64 -fpic -ggdb3 -fno-var-tracking-assignments -std=c++14 \
-rdynamic \
-o ./build/main-prog \
/some_absolute_path/ParentBox.o \
/some_other_pathen/Triangle.o \
/some_other_pathen/Circle.o \
/some_other_pathen/<lots_of_depending_obj> \
/some_absolute_path/librandom.a \
-lz -ldl -lrt -lbz2
I searched every argument of this command line in https://gcc.gnu.org/onlinedocs/gcc/Option-Index.html (This seems to be a good reference site for all fellow programmers working with large projects with complicated g++ line :) )
Thanks to @Employed Russian. With his instructions, the problem narrows down to export the symbols in main program binary.
However, the main program binary has lots of dependencies as you can see from the above command, Circle, Triangle and lots of other object files. We also need to add "-rdynamic" to the compilation of Circle, Triangle and other dependency object files. Otherwise it does not work.
In my case, I added "-rdynamic" to all files in my project to export all symbols. Not sure about "-fvisibility=hidden" doing anything good. I removed all of them in my Makefile anyway... I know this is not the best way but I will worry about speed later when everything is functionally correct. :)
More Updates: The correct solution is in @Employed Russian's update in the answer. My previous solution happened to work because I also removed "-fvisibility=hidden". It is not necessary (and probably wrong) to add -rdynamic to all objects used in the final link. Please refer to @Employed Russian's explanation which addresses the core issue.
Final Update: For fellow programmers who are interested in how C/C++ program is executed and how library can be linked, here is a good reference web course (Life of Binary) by Xeno Kovah: http://opensecuritytraining.info/LifeOfBinaries.html
You can also find a playlist on youtube. Just search "Life of Binary"
Although there is name mangling the symbol names are exactly the same. I'm just wondering why the dynamic linker cannot link them?
Most likely explanation: the symbol is not exported from the main binary.
Repeat your command with nm -D
:
nm -AD lib*.so main-prog | grep ' _ZTI7ParentBox$'
Chances are, you'll see lib*.so: U _ZTI7ParentBox
and nothing from main-prog
.
This happens because normally the linker will not export any symbol from main-prog
, that is not referenced by some shared library participating in the link (and your lib*.so
isn't linked with main-prog
, or else you wouldn't need to dlopen
it).
To change that behavior, you could add -Wl,--export-dynamic
linker flag when linking main-prog
. That instructs the linker to export everything that is linked into main-prog
.
tried -rdynamic
That is equivalent to -Wl,--export-dynamic
, and should have worked (assuming you added it to the main-prog
link line, and not somewhere else).
Update:
Everything works now! Since main-prog also depends on some other objects, it appears that simply add -rdynamic to the final main-prog linking does not resolve the problem. We need to add "-rdynamic" to the compilation of those depending objects.
That is the wrong solution. Your problem is that -fvisibility=hidden
tells the compiler to mark all symbols that go into main-prog
as not exported, and -rdynamic
doesn't export any hidden symbols.
The correct solution is to remove -fvisibility=hidden
from any objects that define symbols you do want to export, and add -rdynamic
to the final link.