c global-variables shared-libraries static-linking dynamic-linking

Does using shared libraries lead to having a single instance of global variables?


Suppose we have a program (executable) prog that links to libA and libB. Both libA and libB in turn link to libX, which contains a global variable.

Will the global variable have a single instance, or two different instances within the prog process in the following cases?

  1. prog is linking to libA and libB dynamically, but both of those link to libX statically. (I assume two instances?)
  2. prog is linking to libA and libB dynamically, and both of those link to libX dynamically. (I assume one instance?)
  3. prog is linking to libA and libB statically, and both of those link to libX statically. (I assume one instance again?)

Additionally:

  • Does it make a difference if the global was declared static (local to the translation unit)? In my use case, it is.
  • Does the answer to these questions differ between operating systems? (macOS, Linux, Windows)
  • Can you recommend some reading which explains the fundamentals that I need to be able to answer similar questions myself?

Solution

  • prog is linking to libA and libB dynamically, but both of those link to libX statically. (I assume two instances?)

    In this case, the answer depends on which symbols are exported from libA.so and libB.so.

    If the variable (let's call it glob) is declared static, i.e. has internal linkage, then it will not be exported and you will have two separate instances.

    Likewise, if the variable is not declared static, but libX is compiled with e.g. -fvisibility=hidden, or if either libA.so or libB.so is linked with a linker script that prevents glob from being exported, you will have two separate instances.

    However, if the variable has external linkage and its visibility is not restricted via one of the above mechanisms, then (by default) it will be exported from both libA.so and libB.so, and in that case all references to that variable will bind to the definition in whichever library is loaded first.
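The three cases above correspond to three ways libX could declare the variable. Here is a minimal sketch, assuming GCC or Clang (the visibility attribute is a compiler extension). The names glob_static, glob_hidden, glob_default and the accessor functions are hypothetical and do not come from the question.

```c
/* Hypothetical libX source, showing the three declaration styles. */

/* Internal linkage: never exported, so every shared library that
 * statically links this file gets its own private copy. */
static int glob_static = 1;

/* External linkage, hidden visibility (GCC/Clang attribute): other object
 * files linked into the same library can see it, but the library does not
 * export it. Compiling with -fvisibility=hidden makes this the default. */
__attribute__((visibility("hidden"))) int glob_hidden = 2;

/* External linkage, default visibility: exported from any shared library
 * that contains it, unless a flag or linker script restricts it. */
int glob_default = 3;

/* Accessors, so code outside this file can observe each variable. */
int get_glob_static(void)  { return glob_static; }
int get_glob_hidden(void)  { return glob_hidden; }
int get_glob_default(void) { return glob_default; }
```

On Linux, running nm -D on the resulting libA.so should list only the exported symbols, which is one way to check which of these declarations actually leave the library.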

    Update:

    Will there be two instances of that variable in memory, with only the first one accessible, or will the linker not reserve any space at all for the second variable?

    There will be two instances in memory.

    When the linker builds libA.so, or libB.so, it has no idea what other libraries exist, and so it must reserve space in the readable and writable segment (the segment into which .data and .bss sections usually go) of the corresponding library.

    At runtime, the loader mmaps the entire segment, so it cannot avoid reserving memory for the variable in each library.

    But when the code references the variable at runtime, the loader will resolve all such references to the first definition of the symbol it encounters in its search order.

    Note: the above is the default behavior on ELF systems. Windows DLLs behave differently, and linking libraries with -Bsymbolic may change the outcome of symbol resolution as well.
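One way to see which outcome you actually got is to have each library report the address of the glob it references and compare the two in prog. The sketch below assumes hypothetical accessors libA_glob_addr and libB_glob_addr. In a real test they would be defined in libA and libB respectively. Here they are stand-ins in the same file so that the example compiles on its own.

```c
/* Stand-ins (assumption): in a real test, libA and libB would each define
 * one of these, returning the address of the glob that library references. */
static int glob;
int *libA_glob_addr(void) { return &glob; }
int *libB_glob_addr(void) { return &glob; }

/* Equal addresses mean both libraries are bound to one instance;
 * different addresses mean each library is using its own copy. */
int instances_in_use(const int *from_a, const int *from_b) {
    return from_a == from_b ? 1 : 2;
}
```

In prog, calling instances_in_use(libA_glob_addr(), libB_glob_addr()) with the real libraries reports how many copies are in use, whatever the visibility settings and linker flags turned out to be.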

  • prog is linking to libA and libB dynamically, and both of those link to libX dynamically. (I assume one instance?)

    Correct.

  • prog is linking to libA and libB statically, and both of those link to libX statically. (I assume one instance again?)

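    This is an impossible scenario: you can't link libA.a against libX.a. A static archive is just a collection of object files, and no linking takes place when it is created.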

    But when linking prog against libA.a, libB.a and libX.a, yes: you will end up with one instance of glob.