Tags: c++, g++, clang, shared-libraries

Difference between clang and g++ when linking against a shared library that defines a symbol also defined in a local object file


I ran into an issue with a C++ project that I recently started working on. The project is organized as follows (and I can't change this):

  • The core part of the code resides in a central shared library. I have access to the central library's source files, and I can modify the current compilation procedure if necessary. For the sake of discussion, let's assume that the central library liba.so is created by compiling a single file a.cpp that contains the two functions void foo() and void bar().
  • Starting from this core part, multiple executables are created. Each executable resides in its own folder and is created by compiling a main file main.cpp and, possibly, other source files that reimplement some of the functions already present in the core library. For the sake of discussion, let's assume that there is a file b.cpp that contains void bar().

The intended design is that functions defined in the local source files always take priority. The project was tested with g++ on a Linux machine, but it behaves differently when compiled with clang on an Apple machine.

I know that this design is probably best avoided altogether, but it is not my responsibility. I was only checking whether the project can also be built with clang.

This behavior can be reproduced with the following code:

a.h

#ifndef a_h_g 
#define a_h_g 
void foo(); 
void bar(); 
#endif 

a.cpp

#include <iostream> 
#include "a.h" 
void foo(){ 
  std::cout<<"I am foo in a.cpp"<<std::endl; 
  bar(); 
} 

void bar(){ 
  std::cout<<"I am bar in a.cpp"<<std::endl; 
} 

b.cpp

#include "a.h" 
#include <iostream> 
void bar(){ 
  std::cout<<"I AM bar in b.cpp"<<std::endl; 
} 

main.cpp

#include "a.h" 
#include <iostream> 
int main(){ 
  std::cout<<"I AM MAIN"<<std::endl; 
  foo(); 
  return 0; 
} 

and compiling as:

g++ -fPIC -shared -o liba.so a.cpp

g++ -c b.cpp

g++ -c main.cpp

g++ -o main main.o b.o -L. -la

When I try this code on a CentOS 7 machine with g++ version 11.2.0, the result is the one the project is targeting:

I AM MAIN
I am foo in a.cpp
I AM bar in b.cpp

On a Mac with clang version 14.0.0, the result is instead:

I AM MAIN 
I am foo in a.cpp 
I am bar in a.cpp 

Is there a way to reproduce the CentOS 7 result with clang?


Solution

  • This difference has nothing to do with Clang or GCC, and everything to do with how shared libraries work on different platforms.

    On macOS, a shared library prefers symbols defined in the same shared library to those defined globally (similar to how Windows DLLs work).

    On Linux, a shared library prefers global symbols, unless the library is linked with -Bsymbolic or -Bsymbolic-functions.

    From the macOS linker documentation:

    Two-level namespace
      By default all references resolved to a dynamic library record the library
      to which they were resolved. At runtime, dyld uses that information to
      directly resolve symbols. The alternative is to use the -flat_namespace
      option.  With flat namespace, the library is not recorded.
    
      At runtime, dyld will search each dynamic library in load order when
      resolving symbols. This is slower, but more like how other operating
      systems resolve symbols.
    

    You should be able to get Linux-like behavior by adding -Wl,-flat_namespace to the liba.so link line on macOS.
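
    To make the two lookup strategies from the quote concrete, here is a toy C++ model. It is only a sketch, not how dyld is implemented: Image, exampleImages, resolveFlat and resolveTwoLevel are names I made up for illustration, and the "recorded image" argument is a simplification of what the static linker stores for each reference.

    ```cpp
    #include <cassert>
    #include <iostream>
    #include <set>
    #include <string>
    #include <vector>

    // Toy model (hypothetical types): an image is the executable or a dynamic
    // library, reduced to a name plus the set of symbols it exports.
    struct Image {
        std::string name;
        std::set<std::string> exports;
    };

    // Images in load order for the example in the question:
    // the executable (main.o + b.o) first, then liba.
    std::vector<Image> exampleImages() {
        return {{"main", {"main", "bar"}}, {"liba", {"foo", "bar"}}};
    }

    // Flat namespace: search every image in load order; the first match wins.
    std::string resolveFlat(const std::vector<Image>& images,
                            const std::string& symbol) {
        for (const Image& image : images)
            if (image.exports.count(symbol)) return image.name;
        return "";  // unresolved
    }

    // Two-level namespace: only the image recorded at link time is consulted.
    std::string resolveTwoLevel(const std::vector<Image>& images,
                                const std::string& recordedImage,
                                const std::string& symbol) {
        for (const Image& image : images)
            if (image.name == recordedImage && image.exports.count(symbol))
                return image.name;
        return "";  // unresolved
    }

    int main() {
        const std::vector<Image> images = exampleImages();
        assert(images.size() == 2);  // sanity check on the toy data
        // The reference to bar made from inside liba, under each policy.
        std::cout << "flat:      bar -> " << resolveFlat(images, "bar") << "\n";
        std::cout << "two-level: bar -> "
                  << resolveTwoLevel(images, "liba", "bar") << "\n";
    }
    ```

    In this model the flat lookup finds bar in the executable because it comes first in load order, while the two-level lookup stays inside liba. That mirrors the Linux and macOS outputs in the question.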

    Going the other way, you can reproduce the macOS result on Linux by adding -Wl,-Bsymbolic to the liba.so link line. From the Linux ld man page:

    -Bsymbolic
      When creating a shared library, bind references to global symbols
      to the definition within the shared library, if any.  Normally, it is
      possible for a program linked against a shared library to override the
      definition within the shared library.  This option is only meaningful
      on ELF platforms which support shared libraries.
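
    As a rough illustration of what that binding means for the example in the question, here is a single-file C++ simulation. It is a sketch under my own assumptions, not real loader code: globalScope stands in for the dynamic loader's global symbol table, libaLinkedSymbolic stands in for linking liba.so with -Bsymbolic, and the liba and exe namespaces stand in for the two binaries.

    ```cpp
    #include <cassert>
    #include <iostream>
    #include <map>
    #include <sstream>
    #include <string>

    using Fn = void (*)(std::ostream&);

    // Hypothetical stand-ins for loader state.
    std::map<std::string, Fn> globalScope;  // symbol name -> winning definition
    bool libaLinkedSymbolic = false;        // models -Wl,-Bsymbolic on liba.so

    namespace liba {
    void bar(std::ostream& out) { out << "I am bar in a.cpp\n"; }
    void foo(std::ostream& out) {
        out << "I am foo in a.cpp\n";
        if (libaLinkedSymbolic)
            bar(out);                     // bound to the library's own bar
        else
            globalScope.at("bar")(out);   // resolved through the global scope
    }
    }  // namespace liba

    namespace exe {
    void bar(std::ostream& out) { out << "I AM bar in b.cpp\n"; }
    }  // namespace exe

    // Runs the simulated program and returns everything it would print.
    std::string run(bool symbolic) {
        libaLinkedSymbolic = symbolic;
        globalScope.clear();
        // Load order: executable first, then liba. emplace keeps the first
        // definition of a name, like a first-match search in load order.
        globalScope.emplace("bar", &exe::bar);
        globalScope.emplace("foo", &liba::foo);
        globalScope.emplace("bar", &liba::bar);  // ignored, bar already present
        assert(globalScope.at("bar") == &exe::bar);  // the executable's bar won
        std::ostringstream out;
        out << "I AM MAIN\n";
        globalScope.at("foo")(out);
        return out.str();
    }

    int main() {
        std::cout << "default link:\n" << run(false)
                  << "with -Bsymbolic:\n" << run(true);
    }
    ```

    The only thing the flag changes in this model is how foo reaches bar. The global table is identical in both runs, which is why main itself would still see the executable's bar either way.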
    

    Repeating your commands on Linux:

    $ g++ -fPIC -shared -o liba.so a.cpp
    $ g++ -c b.cpp
    $ g++ -c main.cpp
    $ g++ -o main main.o b.o -L. -la -Wl,-rpath=.
    $ ./main
    I AM MAIN
    I am foo in a.cpp
    I AM bar in b.cpp
    

    Now rebuild liba.so with -Bsymbolic:

    $ g++ -fPIC -shared -o liba.so a.cpp -Wl,-Bsymbolic
    $ ./main
    I AM MAIN
    I am foo in a.cpp
    I am bar in a.cpp