Linux shared library versioning for backwards compatibility using libtool


I maintain a shared library that uses libtool, runs (mostly) on Linux and spits out the following files.

lrwxrwxrwx  1 root root    18 jun 10 16:12 libxxx.so -> libxxx.so.0.0.1
lrwxrwxrwx  1 root root    18 jun 10 16:12 libxxx.so.0 -> libxxx.so.0.0.1
-rwxr-xr-x  1 root root  760K jun 10 16:12 libxxx.so.0.0.1

libtool version-info is currently 0:1:0

I want to add functionality to the library's API/ABI, without removing or modifying any existing API/ABI, so that:

  • I produce a library that can still be used by binaries built against the older versions of the library. So the new library acts as a drop-in replacement, no rebuilding of old binaries is needed.
  • Binaries built against the new library and using the new API/ABI fail at the loading stage when the library containing the new API is not found.

How can I achieve this with libtool?

I tried setting version-info to 1:0:1 as suggested here

Programs using the previous version may use the new version as drop-in replacement, but programs using the new version may use APIs not present in the previous one. In other words, a program linking against the new version may fail with “unresolved symbols” if linking against the old version at runtime: set revision to 0, bump current and age.

This results in the following files:

lrwxrwxrwx  1 root root    18 jun 10 16:24 libxxx.so -> libxxx.so.0.1.0
lrwxrwxrwx  1 root root    18 jun 10 16:24 libxxx.so.0 -> libxxx.so.0.1.0
-rwxr-xr-x  1 root root  760K jun 10 16:24 libxxx.so.0.1.0
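
For reference, both file listings above are consistent with how libtool maps -version-info CURRENT:REVISION:AGE onto library names on Linux: the SONAME major number is CURRENT - AGE, and the real file name ends in (CURRENT - AGE).AGE.REVISION. The helper below is a small sketch of that arithmetic; the function name libtool_linux_names is made up for illustration and is not part of libtool.

```shell
#!/bin/sh
# Map a libtool -version-info triple (CURRENT:REVISION:AGE) to the names
# libtool uses on Linux. Hypothetical helper, for illustration only.
libtool_linux_names() {
    lib=$1
    info=$2
    current=${info%%:*}      # text before the first colon
    rest=${info#*:}          # REVISION:AGE
    revision=${rest%%:*}
    age=${rest#*:}
    major=$((current - age)) # SONAME major number
    echo "soname=${lib}.so.${major} file=${lib}.so.${major}.${age}.${revision}"
}

libtool_linux_names libxxx 0:1:0   # soname=libxxx.so.0 file=libxxx.so.0.0.1
libtool_linux_names libxxx 1:0:1   # soname=libxxx.so.0 file=libxxx.so.0.1.0
```

Because CURRENT and AGE were bumped together, the SONAME stays libxxx.so.0, which is why old binaries keep loading the new library.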

However, binaries built against the new library will load and will then fail during runtime with an undefined symbol error if at runtime they enter a codepath that includes one of the new symbols not present in the old library.

I can increase the SONAME to libxxx.so.1 but then I break the binaries built against the older version, while the new version is still compatible.


Solution

  • However, binaries built against the new library will load and will then fail during runtime with an undefined symbol error if at runtime they enter a codepath that includes one of the new symbols not present in the old library.

    TL;DR: I don't think there is a solution that will achieve the desired result right now (unless you are already using versioned symbols), but you can make it a bit better now, and you can fix it completely for next time.


    This is the problem that the GNU symbol versioning extension solves.

    It's probably best to have an example. Initial setup:

    // foo_v1.c
    int foo() { return 42; }
    
    // main_v1.c
    int foo(void);  // explicit prototype: newer compilers reject implicit declarations
    int main() { return foo(); }
    
    gcc -fPIC -shared -o foo.so foo_v1.c
    gcc -w main_v1.c ./foo.so -o main_v1
    
    ./main_v1; echo $?
    42
    

    Now let's modify the library so that it's a drop-in replacement, but has a new function:

    mv foo.so foo.so.v1
    
    // foo_v2.c
    int foo() { return 42; }
    int bar() { return 24; }
    
    gcc -fPIC -shared -o foo.so foo_v2.c
    
    ./main_v1; echo $?
    42
    

    Everything still works (as expected). Now let's build main_v2, which requires the new function.

    // main_v2.c
    int foo(void); int bar(void);  // explicit prototypes
    int main() { return foo() - bar(); }
    
    gcc -w main_v2.c ./foo.so -o main_v2
    
    ./main_v2; echo $?
    18
    

    Everything still works. And now we break things:

    mv foo.so foo.so.v2
    cp foo.so.v1 foo.so
    ./main_v1; echo $?
    42
    ./main_v2
    ./main_v2: symbol lookup error: ./main_v2: undefined symbol: bar
    

    Voilà: we have a failure at runtime, instead of the (desired) failure at load time. (This isn't actually visible in the above output, but can be trivially verified by adding e.g. a printf in main.)

    Solution: Let's give foo.so a version script:

    // foo.lds
    FOO_v2 {
      global: bar;
    };
    
    gcc -fPIC -shared -o foo.so foo_v2.c -Wl,--version-script=foo.lds
    gcc -w main_v2.c ./foo.so -o main_v2a
    
    ./main_v1; echo $?
    42
    ./main_v2a; echo $?
    18
    

    As we can see, everything still works with the new version of the library.

    But what happens when we run the new main_v2a against old foo.so?

    mv foo.so foo.so.v2
    cp foo.so.v1 foo.so
    ./main_v2a
    
    ./main_v2a: ./foo.so: no version information available (required by ./main_v2a)
    ./main_v2a: symbol lookup error: ./main_v2a: undefined symbol: bar, version FOO_v2
    

    This is a little bit better: the failure still happens at runtime, but the loader does mention FOO_v2, providing a hint that this is caused by some kind of "version too old" problem.

    The reason the loader doesn't fail at load time is that the (old) foo.so has no version info whatsoever.

    If you now repeat this process, creating FOO_v3 with a new function, and try to run main_v3 against the foo.so.v2 version of the library, you would get a failure at load time:

    // foo_v3.c
    int foo() { return 42; }
    int bar() { return 24; }
    int baz() { return 12; }
    
    // foo_v3.lds
    FOO_v2 {
      global: bar;
    };
    FOO_v3 {
      global: baz;
    } FOO_v2;
    
    // main_v3.c
    int foo(void); int bar(void); int baz(void);  // explicit prototypes
    int main() { return foo() - bar() - baz(); }
    
    gcc -fPIC -shared -o foo.so foo_v3.c -Wl,--version-script=foo_v3.lds
    gcc -w main_v3.c ./foo.so -o main_v3
    
    ./main_v3; echo $?
    6
    

    Now let's run main_v3 against foo.so.v2:

    cp foo.so.v2 foo.so
    
    ./main_v3
    ./main_v3: ./foo.so: version `FOO_v3' not found (required by ./main_v3)
    

    This time, the failure happens at load time. QED.