For context: I have a Java project that is partially implemented with two JNI libraries. For the sake of example, libbar.so depends on libfoo.so. If these were system libraries,
System.loadLibrary("bar");
would do the trick. But since they're custom libraries I'm shipping with my JAR, I have to do something like
System.load("/path/to/libfoo.so");
System.load("/path/to/libbar.so");
libfoo needs to go first because otherwise libbar can't find it, as it's not in the system library search path.
This has been working well for a while, but I've now run into an issue where std::any_cast is throwing std::bad_any_cast despite the types being correct. I tracked it down to the fact that both libraries have a different definition of the typeinfo for that type, and they're not being merged at runtime. This seems to be because System.load() ends up invoking dlopen() with RTLD_LOCAL rather than RTLD_GLOBAL.
I wrote this to demonstrate the behaviour without needing JNI:
foo.hpp
class foo { };

extern "C" const void* libfoo_foo_typeinfo();
foo.cpp
#include "foo.hpp"
#include <typeinfo>

extern "C" const void* libfoo_foo_typeinfo() {
    return &typeid(foo);
}
bar.cpp
#include "foo.hpp"
#include <typeinfo>

extern "C" const void* libbar_foo_typeinfo() {
    return &typeid(foo);
}
main.cpp
#include <iostream>
#include <typeinfo>
#include <dlfcn.h>

int main() {
    void* libfoo = dlopen("./libfoo.so", RTLD_NOW | RTLD_LOCAL);
    void* libbar = dlopen("./libbar.so", RTLD_NOW | RTLD_LOCAL);

    auto libfoo_fn = reinterpret_cast<const void* (*)()>(
        dlsym(libfoo, "libfoo_foo_typeinfo"));
    auto libbar_fn = reinterpret_cast<const void* (*)()>(
        dlsym(libbar, "libbar_foo_typeinfo"));

    auto libfoo_ti = static_cast<const std::type_info*>(libfoo_fn());
    auto libbar_ti = static_cast<const std::type_info*>(libbar_fn());

    std::cout << std::boolalpha
              << (libfoo_ti == libbar_ti) << "\n"
              << (*libfoo_ti == *libbar_ti) << "\n";

    return 0;
}
Makefile
all: libfoo.so libbar.so main

libfoo.so: foo.cpp
	$(CXX) -fpic -shared -Wl,-soname=$@ $^ -o $@

libbar.so: bar.cpp
	$(CXX) -fpic -shared -Wl,-soname=$@ $^ -L. -lfoo -o $@

main: main.cpp
	$(CXX) $^ -ldl -o $@
On my system, I get
$ make
...
$ ./main
false
true
This is because even though the typeinfo addresses are different, GCC's libstdc++ uses the mangled names for equality. On LLVM's libc++, for example, equality is based on the typeinfo address itself, so I get:
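To make the difference between the two comparison strategies concrete, here is a minimal sketch of both. The helper names same_by_address and same_by_name are hypothetical, and the real libstdc++ and libc++ implementations handle more cases than this; it only illustrates why duplicated typeinfo objects can compare equal under one strategy and unequal under the other.

```cpp
#include <cstring>
#include <typeinfo>

// Hypothetical helper: address-based equality, roughly the libc++ behaviour
// described above. Two copies of the same typeinfo living in different
// libraries have different addresses, so this reports false for them.
inline bool same_by_address(const std::type_info& a, const std::type_info& b) {
    return &a == &b;
}

// Hypothetical helper: name-based equality, roughly the libstdc++ behaviour
// described above. Duplicated typeinfo objects still carry the same mangled
// name, so this reports true for them.
inline bool same_by_name(const std::type_info& a, const std::type_info& b) {
    return std::strcmp(a.name(), b.name()) == 0;
}
```

In terms of main.cpp, same_by_address(*libfoo_ti, *libbar_ti) corresponds to the first line of output, and same_by_name(*libfoo_ti, *libbar_ti) to what the second line shows when built against libstdc++.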
$ make CXX="clang++ -stdlib=libc++"
$ ./main
false
false
If I pass RTLD_GLOBAL instead, I see
true
true
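For reference, the RTLD_GLOBAL variant could be sketched as a small wrapper around dlopen() that also reports failures, which main.cpp skips for brevity. The names make_flags and open_library are made up for this illustration; only dlopen(), dlerror() and the RTLD_* constants come from <dlfcn.h>.

```cpp
#include <dlfcn.h>
#include <cstdio>

// Hypothetical helper: build the dlopen() flags, choosing symbol visibility.
inline int make_flags(bool global) {
    return RTLD_NOW | (global ? RTLD_GLOBAL : RTLD_LOCAL);
}

// Hypothetical helper: open a library with the chosen visibility and print
// the dlerror() message if loading fails. Returns nullptr on failure.
inline void* open_library(const char* path, bool global) {
    void* handle = dlopen(path, make_flags(global));
    if (handle == nullptr) {
        std::fprintf(stderr, "dlopen(%s) failed: %s\n", path, dlerror());
    }
    return handle;
}
```

Replacing the two dlopen() calls in main.cpp with open_library("./libfoo.so", true) and open_library("./libbar.so", true) should give the RTLD_GLOBAL behaviour, while passing false keeps the original RTLD_LOCAL behaviour.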
And if I edit main.cpp to load libbar.so first, it also works, provided I tell it where it can find libfoo.so:
$ LD_LIBRARY_PATH=. ./main
true
true
But for the reasons described at the top of this post, neither of these is a practical workaround.
This is very similar to https://github.com/android-ndk/ndk/issues/533 but with non-dynamic types, so there's no way to add a "key function" to force the typeinfo to be a strong symbol. I happened to reproduce the problem on Android first, but it isn't Android-specific.
No, that is not possible. RTLD_LOCAL seeks to prevent exactly that, and unfortunately must be used for System.loadLibrary, since otherwise bad things will happen if you System.loadLibrary two libraries that each define different foo classes.