linker java-native-interface shared-libraries

Make a shared library self-contained (pre-link dependencies of a shared library into it, recursively)?


I want to package a native shared library into a Java application (so it becomes part of the JAR), to be called via JNI.

The shared library (Linux) comes from a 3rd party and has some dependencies, looking roughly like this:

$ readelf -d libpdalcpp.so.17.0.0
 
Dynamic section at offset 0xd9a4a0 contains 38 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libgdal.so.32]
 0x0000000000000001 (NEEDED)             Shared library: [libproj.so.25]
 0x0000000000000001 (NEEDED)             Shared library: [libgeotiff.so.5]
 0x0000000000000001 (NEEDED)             Shared library: [libxml2.so.2]
 0x0000000000000001 (NEEDED)             Shared library: [libz.so.1]
 ...
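
The NEEDED entries in that output can be extracted mechanically, which helps when checking at build time what a library will ask the dynamic loader for. A minimal sketch in Python, assuming the `readelf -d` line format shown above; `needed_libraries` is a hypothetical helper name, not part of any existing tool:

```python
import re
from typing import List

# Matches lines such as:
#  0x0000000000000001 (NEEDED)   Shared library: [libz.so.1]
_NEEDED = re.compile(r"\(NEEDED\)\s+Shared library:\s+\[([^\]]+)\]")


def needed_libraries(readelf_output: str) -> List[str]:
    """Return the NEEDED sonames found in `readelf -d` output, in order."""
    return _NEEDED.findall(readelf_output)


# Two lines copied from the listing above, used as sample input.
SAMPLE = """\
 0x0000000000000001 (NEEDED)             Shared library: [libgdal.so.32]
 0x0000000000000001 (NEEDED)             Shared library: [libz.so.1]
"""

if __name__ == "__main__":
    print(needed_libraries(SAMPLE))
```

In practice the input would come from something like `subprocess.run(["readelf", "-d", path], capture_output=True, text=True).stdout`. This lists direct dependencies only; each library named has NEEDED entries of its own.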

My problem is that I cannot (don't want to) guarantee that these dependencies will be available at runtime; the resulting Java application will run in some JRE container based on some (I don't want to care which) Linux distribution.

So what I want is to resolve these dependencies at build time and turn my initial shared library, which depends on other shared libraries, into one with no (or as few as possible) dependencies on shared libraries from the system.

  • Can this be achieved? Do linkers support this kind of operation?
  • Is the whole idea stupid? (the best answer I got so far was that the principle behind shared libraries is modularity; however, bytecode machines like the JRE (or CLR, or ...) fall into the category where shared libraries are used for plugins, and a plugin usually should be self-contained)

Solution

  • So what I want is to resolve these dependencies at build time, and convert my initial shared library with dependencies on other shared libraries into one without any

    That is impossible in practice: almost all UNIX systems consider the shared library a final link product, which can't be meaningfully modified (except in a very limited number of ways).

    Update:

    if you know a technical reason why "That is impossible"

    The short explanation for why that's impossible (or at least hard) is that the information needed to relink two DSOs into one (in particular the relocation records) has been discarded during the original DSO linking process.

    The longer explanation is that a DSO can only have a single procedure linkage table (PLT). So you would need to merge two separate PLTs into one. That sounds easy enough -- they are just flat tables.

    However, (at least part of) this merged PLT will have to be at a different address from where the original PLTs were. And you would need to update all calls into PLT with the new addresses.

    How are you going to find all the addresses that need updating? In the original link, the linker used relocation records in the individual .o files to know which call site to point at which PLT slot. These records are now gone (AIX is the only UNIX variant I know of which retains relocation records in the linked DSO).
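
To make the role of relocation records concrete, here is a toy model in Python. It is only a sketch: the record layout `(field_offset, symbol, addend)`, the function name `apply_relocations`, and the addresses are all made up for illustration, and the patching rule (slot + addend - place) is modeled loosely on how PC-relative call operands are resolved on x86_64.

```python
import struct


def apply_relocations(code: bytes, base: int, relocs, plt) -> bytes:
    """Patch each 32-bit PC-relative operand so the call lands on its PLT slot.

    relocs: list of (field_offset, symbol, addend) tuples, toy stand-ins
            for the relocation records found in .o files.
    plt:    dict mapping symbol name -> address of its PLT slot.
    """
    out = bytearray(code)
    for offset, symbol, addend in relocs:
        place = base + offset                  # address of the operand field
        value = plt[symbol] + addend - place   # PC-relative displacement
        out[offset:offset + 4] = struct.pack("<i", value)
    return bytes(out)


# call rel32 (opcode 0xE8) with an unfilled operand, followed by ret (0xC3).
OBJECT_CODE = bytes([0xE8, 0, 0, 0, 0, 0xC3])
# The operand sits at offset 1; the addend is -4 because the displacement
# is measured from the end of the 4-byte operand (the next instruction).
RELOCS = [(1, "foo", -4)]

# With the records available, moving the PLT is just a matter of re-patching.
first_link = apply_relocations(OBJECT_CODE, 0x0, RELOCS, {"foo": 0x100})
relinked = apply_relocations(OBJECT_CODE, 0x0, RELOCS, {"foo": 0x200})

if __name__ == "__main__":
    print(first_link.hex(), relinked.hex())
```

Once `RELOCS` is thrown away, all that remains is the patched byte string, and nothing in it marks which four bytes are a PLT displacement rather than ordinary data.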

    The linker would have to disassemble the two DSOs, and reconstruct these relocation records. Unfortunately, disassembling variable-length instruction sets (such as x86_64) is hard. A general solution requires complete control flow analysis.
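
A small illustration of why call sites cannot be recovered by simply scanning the bytes. This is a sketch in Python covering only four x86 opcodes (a real decoder also has to handle prefixes, ModRM bytes, data embedded in code, and so on); the function names are hypothetical.

```python
# Toy subset of x86 encodings: opcode -> (mnemonic, total length in bytes).
OPCODES = {
    0x90: ("nop", 1),
    0xC3: ("ret", 1),
    0xB8: ("mov eax, imm32", 5),
    0xE8: ("call rel32", 5),
}


def scan_for_call_opcode(code: bytes):
    """Naive approach: treat every 0xE8 byte as the start of a call."""
    return [i for i, b in enumerate(code) if b == 0xE8]


def decode_from(code: bytes, start: int = 0):
    """Decode linearly from a known instruction boundary.

    Returns the offsets of the call instructions actually encountered.
    """
    calls, i = [], start
    while i < len(code):
        mnemonic, length = OPCODES[code[i]]
        if mnemonic.startswith("call"):
            calls.append(i)
        i += length
    return calls


# mov eax, 0x000000E8 ; call +0x10 ; ret
CODE = bytes([0xB8, 0xE8, 0x00, 0x00, 0x00,
              0xE8, 0x10, 0x00, 0x00, 0x00,
              0xC3])

if __name__ == "__main__":
    print(scan_for_call_opcode(CODE))  # candidate offsets from the byte scan
    print(decode_from(CODE))           # calls found when boundaries are known
```

The byte scan reports an extra "call" inside the immediate operand of the `mov`. Telling the two apart requires knowing where instructions begin, which in general means following the control flow from known entry points.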