Initialization inside of a shared library (.so)


I'm porting a bigger project to Linux (originally Windows) and I'm painfully missing a counterpart of Windows DllMain(). DllMain(DLL_PROCESS_ATTACH) is called after C runtime (CRT) init is done, so all global statics (independent of translation unit) are already initialised. You can do more complex stuff here before main() is called or before LoadLibrary / dlopen returns. I am trying to achieve similar behaviour on Linux. I evaluated GCC's __attribute__((init_priority)) and __attribute__((constructor)), but without success, as it does not seem possible to specify a priority later than the default of 0xFFFF, so you don't get a function that is called after dynamic initialisation of your statics. (And marking all statics except the chosen one with a lower number is just not practicable.)

My last hope is ELF init/fini, which brings me to a level I've never been at before. Is the CRT init/deinit done via init/fini? If I register my own init/fini functions, I guess the default ones which init/deinit the CRT are overwritten since, afaik, there is no chain of init/fini functions. If this is right, I need to init/deinit the CRT by myself, but how do I do this? Or am I totally on the wrong track?

Edit, better explanation of the idea to (mis)use a ctor/dtor of a global object to do init/deinit:

struct InitDeinit {
  InitDeinit(void) { ... init stuff }
  ~InitDeinit() { ... deinit stuff }
};
namespace {
  InitDeinit __attribute__ ((init_priority (65535))) tco;  // tco=TheChosenOne
}

A higher init_priority number means later initialisation, but the default (if no __attribute__ init_priority is specified) is the max value of 65535, so it is not possible to define an object which is constructed after the "normal" ones (across translation units).

Edit2

I found an interesting article regarding elf init/fini: https://maskray.me/blog/2021-11-07-init-ctors-init-array

I tried

static void init(int argc, char **argv, char **envp)
 { puts(__FUNCTION__); }
__attribute__((section(".init_array"), used)) static typeof(init) *init_p = init;

And my function got called! Unfortunately it is only called as the last one in this TU. I don't know how to put it at the end of .init_array so that it gets called last for the complete library.


Solution

  • You want to be able to code a GNU/Linux shared library in such a way that a library initialisation function is called when the shared library is loaded after the static initialisation of all namespace scope objects in all translation units that contribute to the shared library.

    Presumably your motivation is the experience of a Static Initialisation Order fiasco (SIOF), in a naive attempt at the library initialisation function, where you found that one of your statically initialised objects had not been initialised when that function used it. Or maybe it's the dread of that happening.

    The inevitable rejoinder.

    The canonical way to head off SIOFs is to refactor the code so that all static objects are constructed at first use. That would be a productive investment in the code's quality and your expertise.
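
    For reference, a minimal sketch of construct-at-first-use, applied to the accessors of the toy library used later in this answer. The accessor duo() is hypothetical (the toy library keeps duo as a plain static), and the sketch assumes C++11 or later, where initialisation of function-local statics is thread-safe:

```cpp
#include <string>

// Each former namespace-scope static becomes a function-local static,
// which is constructed the first time control passes through its definition.
std::string const & simons_name() {
    static std::string const simon{"Simon"};
    return simon;
}

std::string const & garfunkels_name() {
    static std::string const garfunkel{"Garfunkel"};
    return garfunkel;
}

// Hypothetical accessor: it calls the two above, so they are necessarily
// constructed before `value`, whatever the link order of the object files.
std::string const & duo() {
    static std::string const value = simons_name() + " and " + garfunkels_name();
    return value;
}
```

    Function-local statics are destroyed in the reverse order of their construction, so value is destroyed before the two strings it was built from.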

    Let's assume you need a cheap solution in the short term.

    I can show you a cheap one that has a decent chance of working. But it's a hack, and although it disarms the type of SIOF in view, it does not disarm a type of Static Finalisation Order Fiasco (SFOF) that is not yet in view. You need to disarm that type of SFOF by other means, which I'll show [1]. Happily those means are cheap, and not a hack.

    A shared library that needs an initialisation function (call it init) likely also needs a finalisation function (call it finis). If a statically initialised object at namespace scope (NSSIO, for short) needs to be initialised before init runs, that's because init refers to it and needs it constructed already. Likewise, in case finis refers to any NSSIO, that object must not yet have been destroyed when finis runs. So in general, you need the shared library's NSSIO constructors to be run before init, and its NSSIO destructors run after finis. Failing to secure the latter requirement puts you in jeopardy of an SFOF. init, finis and the NSSIOs must jointly conform to the principle: destructors in reverse order of constructors.
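
    The principle can be seen in miniature with automatic objects, which follow the same rule. This is only an illustrative sketch: Tracer and lifetime_trace are hypothetical names, and the two local objects merely stand in for an NSSIO and for the init/finis pair:

```cpp
#include <string>

// Hypothetical tracer: logs "+id" when constructed and "-id" when destroyed.
struct Tracer {
    Tracer(char id, std::string & log) : id_(id), log_(log) { log_ += '+'; log_ += id_; }
    ~Tracer() { log_ += '-'; log_ += id_; }
    Tracer(Tracer const &) = delete;
    Tracer & operator=(Tracer const &) = delete;
private:
    char id_;
    std::string & log_;
};

std::string lifetime_trace() {
    std::string log;
    {
        Tracer nssio{'n', log};       // stands in for a static object
        Tracer init_finis{'i', log};  // stands in for init (ctor) and finis (dtor)
    }   // both destroyed here, in reverse order of construction
    return log;
}
```

    Here lifetime_trace() returns "+n+i-i-n": the object that plays finis is torn down while the object it depends on is still alive.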

    The Two Fiascos

    Let's see both these fiascos in action when we naively try to equip a little shared library with initialisation and finalisation functions. Of course the use of an uninitialised object leads to undefined behaviour, so where I get a fiasco you might not, or only on Friday 13th, etc.

    I'm using:

    $ g++ --version
    g++ (Ubuntu 13.2.0-23ubuntu4) 13.2.0
    ...
    $ ld --version
    GNU ld (GNU Binutils for Ubuntu) 2.42
    ...
    

    SIOF

    Source files for shared library:

    $ cat ask_v0.cpp 
    #include "simon.hpp"
    #include "garfunkel.hpp"
    #include <iostream>
    #include <string>
    
    namespace {
        
    std::string duo;
    
    [[gnu::destructor]] void finis() {
        std::cout <<  __PRETTY_FUNCTION__ << std::endl;
        std::cout << "Do you mean \"" << duo << 
        "'s Greatest Hits\"?" << std::endl;
    }
    
    [[gnu::constructor]] void init() {
        std::cout <<  __PRETTY_FUNCTION__ << std::endl;
        duo = simons_name() + " and " + garfunkels_name();
    }
    
    } // anon. ns.
    
    void ask() {
        std::cout << "Have you got \"Greatest hits of " 
        << duo << "\"?" << std::endl;
    }
    
    
    $ cat simon_v0.cpp 
    #include "simon.hpp"
    #include <string>
    
    namespace { std::string const simon{"Simon"}; }
    
    std::string const & simons_name(void){  return simon; }
    
    
    $ cat garfunkel_v0.cpp 
    #include "garfunkel.hpp"
    #include <string>
    
    namespace { std::string const garfunkel{"Garfunkel"}; }
    
    std::string const & garfunkels_name(void) { return garfunkel; }
    

    Headers:

    $ tail -n +1 *.hpp
    ==> garfunkel.hpp <==
    #pragma once
    #include <string>
    std::string const & garfunkels_name();
    
    ==> simon.hpp <==
    #pragma once
    #include <string>
    std::string const & simons_name();
    

    Build shared library libask.so:

    $ g++ -shared -fPIC -o libask.so ask_v0.cpp simon_v0.cpp garfunkel_v0.cpp
    

    And a program that links against libask.so:

    $ cat main.cpp 
    void ask();
    
    int main()
    {
        ask();
        return 0;
    }
    
    $ g++ -o ask main.cpp -L. -lask -Wl,-rpath=$(pwd)
    

    And:

    $ ./ask
    void {anonymous}::init()
    Segmentation fault (core dumped)
    

    There is the SIOF, in libask's constructor init(), attempting to copy the strings simon and garfunkel into duo before some or all of them are initialised.

    One way of disarming the SIOF

    This is one you've considered and rejected as not cheap enough for you, or maybe too error-prone. But it's easy with my toy libask and it will serve to flush out the SFOF.

    Besides the constructor and destructor function attributes, g++ provides the analogous init_priority(NNNNN) attribute for namespace scope objects, e.g.

    [[gnu::init_priority(NNNNN)]] std::string duo;
    

    where NNNNN is an unsigned integer in the range 101 - 65535 (0xFFFF) that confers a relative initialisation priority on the NSSIO to which it is applied, and a reversed finalisation priority too. The range 0 - 100 is reserved to the compiler. NNNNN = 65535 is redundant because that is the default value, and the value implied for NSSIOs that are not declared with any init_priority(NNNNN) (the usual case). If 65535 is specified, the compiler will discard it to let the default apply. For initialisations, NSSIOs will be initialised in ascending order of the specified or default NNNNN. Thus 101 is maximum priority, 65535 is minimum. For finalisations the reverse is true. This enforces destructors in reverse order of constructors for prioritised NSSIOs.
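
    A small sketch of that ordering within a single translation unit, assuming g++ (or a compatible compiler) on GNU/Linux. Tracer, late, early and nth_constructed are hypothetical names, and the priorities 1000 and 2000 are arbitrary picks from the permitted range:

```cpp
namespace {

// Plain ints with constant initialisers are set up before any dynamic
// initialisation runs, so the constructors below can use them safely.
int next_slot = 0;
int construction_order[2] = {0, 0};

// Hypothetical tracer type: records the order in which objects are built.
struct Tracer {
    explicit Tracer(int id) { construction_order[next_slot++] = id; }
};

// Defined first, but given the later (numerically higher) priority.
[[gnu::init_priority(2000)]] Tracer late{2};
// Defined second, but given the earlier (numerically lower) priority.
[[gnu::init_priority(1000)]] Tracer early{1};

} // anon. ns.

// Returns the id of the n-th object to have been constructed (n = 0 or 1).
int nth_constructed(int n) { return construction_order[n]; }
```

    Despite the definition order, early is constructed first, so nth_constructed(0) is 1 and nth_constructed(1) is 2.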

    Furthermore, the constructor and destructor attributes have explicitly prioritised variants:

    [[gnu::constructor(NNNNN)]]
    [[gnu::destructor(NNNNN)]]
    

    with the same range of NNNNN, and in the variants lacking (NNNNN) the default 65535 is implied.

    A constructor function is bound into the same prioritised call sequencing as the constructors of NSSIOs, and destructor functions are prioritised in reverse order of constructors per their specified or default NNNNN.

    Within a given specified or default NNNNN the calling sequence is unspecified for constructor, destructor or init_priority.

    With this knowledge, we can disarm the SIOF by assigning an init_priority(NNNNN) to the NSSIOs with NNNNN < the constructor/destructor priority. Let's say we reserve some priority headroom by giving init and finis priority 32768 (0x8000) and give the NSSIOs init_priority 16384 (0x4000). I apply these changes in *_v1.cpp versions of the *_v0.cpp source files, giving:

    $ tail -n +1 *v1.cpp
    ==> ask_v1.cpp <==
    #include "simon.hpp"
    #include "garfunkel.hpp"
    #include <iostream>
    #include <string>
    
    namespace {
        
    [[gnu::init_priority(16384)]] std::string duo;
    
    [[gnu::destructor(32768)]] void finis() {
        std::cout <<  __PRETTY_FUNCTION__ << std::endl;
        std::cout << "Do you mean \"" << duo << 
        "'s Greatest Hits\"?" << std::endl;
    }
    
    [[gnu::constructor(32768)]] void init() {
        std::cout <<  __PRETTY_FUNCTION__ << std::endl;
        duo = simons_name() + " and " + garfunkels_name();
    }
    
    } // anon. ns.
    
    void ask() {
        std::cout << "Have you got \"Greatest hits of " 
        << duo << "\"?" << std::endl;
    }
    
    ==> garfunkel_v1.cpp <==
    #include "garfunkel.hpp"
    #include <string>
    
    namespace { [[gnu::init_priority(16384)]] std::string const garfunkel{"Garfunkel"}; }
    
    std::string const & garfunkels_name(void) { return garfunkel; }
    
    ==> simon_v1.cpp <==
    #include "simon.hpp"
    #include <string>
    
    namespace { [[gnu::init_priority(16384)]] std::string const simon{"Simon"}; }
    
    std::string const & simons_name(void){  return simon; }
    

    Rebuild and reload the shared library (this time keeping the object files):

    $ g++ -c -fPIC ask_v1.cpp simon_v1.cpp garfunkel_v1.cpp
    $ g++ -shared -fPIC -o libask.so ask_v1.o simon_v1.o garfunkel_v1.o
    $ ./ask
    void {anonymous}::init()
    Have you got "Greatest hits of Simon and Garfunkel"?
    void {anonymous}::finis()
    Do you mean "�V��t�Xۻkel's Greatest Hits"?
    

    We've fixed the SIOF, but exposed an SFOF. Not a seg-fault on this occasion, just garbage in the prematurely destroyed string.

    SFOF

    What happened there?

    I'll go into this more deeply than I need to because the details pay off later: they will explain the Cheap Hack.

    The SFOF flows from the way in which g++ assembles the calling of constructor[(NNNNN)] and destructor[(NNNNN)] functions in a translation unit versus the way it assembles the construction and destruction of the NSSIOs, plus what the linker does with the resulting object files.

    • G++, Assembling constructor[(NNNNN)] and destructor[(NNNNN)] calls

    If any constructor functions with default priority are defined, the compiler appends their addresses in order of definition to its output .init_array section in the object file. Likewise any destructor functions are appended to its output .fini_array section.

    If any constructor(NNNNN) functions are defined, the compiler emits an .init_array.NNNNN section listing the addresses of these functions in order of definition. Likewise any destructor(NNNNN) functions are appended to an output .fini_array.NNNNN section.

    • G++, Assembling NSSIO construction and destruction

    This is quite different. If there are NSSIOs with default init_priority, a single private function is assembled to set up the construction and destruction of all of them. The name of this function is ad hoc per translation unit and differs from one to another, so I'll just call it static_ctor_dtor_factotem. If there are NSSIOs with non-default init_priority NNNNN, a similar static_ctor_dtor_factotem_NNNNN is assembled.

    The address of static_ctor_dtor_factotem[_NNNNN] is appended to section .init_array[.NNNNN].

    The payload of static_ctor_dtor_factotem[_NNNNN] is this: For each of the NSSIOs that it serves, in definition order:

    • Call the appropriate constructor of that NSSIO.
    • Then call __cxa_atexit, giving it the address of the destructor of that NSSIO and the dso_handle of the load image that defines it. This will append that destructor to the GNU C library's per-image list of at-exit functions registered for calling, in reverse order of registration, when the registering DSO (program or shared library) makes its finale.

    In contrast with the destructor[(NNNNN)] behaviour, nothing is appended to any .fini_array[.NNNNN] section for an NSSIO's destructor.
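
    A hand-written model of that payload may make it more concrete. This is only a sketch of the idea, not what g++ literally emits: model_factotum, duo_storage and modelled_duo are hypothetical names, and std::atexit is used as a stand-in for the __cxa_atexit call:

```cpp
#include <cstdlib>
#include <new>
#include <string>

namespace {

using String = std::string;

// Raw storage standing in for the memory of the static object.
alignas(String) unsigned char duo_storage[sizeof(String)];
String * duo = nullptr;

// What gets registered for exit time: just the object's destructor.
void destroy_duo() { duo->~String(); }

// Stand-in for the compiler-generated per-TU function: construct the
// object, then register its destructor in the at-exit list.
[[gnu::constructor]] void model_factotum() {
    duo = new (duo_storage) String{"Simon and Garfunkel"};
    std::atexit(destroy_duo);   // assumption: models the __cxa_atexit call
}

} // anon. ns.

// Gives access to the modelled static once initialisation has run.
std::string const & modelled_duo() { return *duo; }
```

    The point to notice is that the destructor only reaches the at-exit list as a side effect of running the initialisation function; no .fini_array entry is involved.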

    • What the linker does with that

    The linker consumes all the sections from input object files and merges them into output sections in the linked binary according to the rules encoded in its configured or specified linker script. Per the stock GNU/Linux configuration, it merges input .init_array[.NNNNN] and .fini_array[.NNNNN] sections as follows:

    • All input .init_array.NNNNN sections are sorted in ascending order of NNNNN into the output .init_array section.

    • Then all input .init_array sections are just appended to the output .init_array section as they come.

    • The analogous thing is done to merge the input .fini_array[.NNNNN] sections into the output .fini_array section.

    The result in the linked binary is that:-

    The output .init_array section contains the addresses of all the input constructor[(NNNNN)] functions mingled with those of the input static_ctor_dtor_factotem[_NNNNN] functions, all sorted in ascending NNNNN order (because all those at default priority = 65535 were filled in at the end). The presence of the static_ctor_dtor_factotem[_NNNNN] functions makes it as if all the NSSIO constructors were individually entered in the output .init_array within their initialisation priority bands, paired with their respective at-exit destructor registrations.

    The output .fini_array section contains the addresses of all the input destructor[(NNNNN)] functions, also sorted in ascending NNNNN order. But the .fini_array does not contain the addresses of any input static_ctor_dtor_factotem[_NNNNN] functions, because those functions will instead register the NSSIO destructors in the per-image at-exit list when they are called in initialisation. There's nothing for them to do in a .fini_array[.NNNNN] section.

    In either section, within a given NNNNN band (including default), the order of addresses is effectively arbitrary, because it depends on the order in which the linker consumed input sections, which depends on the order in which object files were linked.
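
    The merge can be modelled as a stable sort on the priority suffix. The following is a toy model only, not how ld is implemented; Entry and merged_init_order are hypothetical names, and an unsuffixed section is represented by priority 65535:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical model of one .init_array entry: the priority suffix of its
// input section (65535 when there is no suffix) and a label for the callee.
struct Entry {
    unsigned priority;
    std::string label;
};

// Models the merge: ascending priority, with link order preserved inside
// each priority band (hence the stable sort).
std::vector<std::string> merged_init_order(std::vector<Entry> entries) {
    std::stable_sort(entries.begin(), entries.end(),
                     [](Entry const & a, Entry const & b) { return a.priority < b.priority; });
    std::vector<std::string> labels;
    for (Entry const & e : entries) labels.push_back(e.label);
    return labels;
}
```

    Fed the v1 inputs in link order, {32768, "init"}, {16384, "duo"}, {16384, "simon"}, {16384, "garfunkel"}, the model puts init last. Swap the order of the three 16384 entries and their relative order in the result changes with it, which is the within-band arbitrariness just described.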

    The asymmetry between the compositions of the .init_array and .fini_array in a linked binary creates the opening for the SFOF we observed.

    • The runtime result

    In runtime initialisation of an image, the .init_array functions are called in index order. In finalisation, the .fini_array functions are called in reverse index order. Prima facie that seems to enforce destructors in reverse order of constructors. But actually it doesn't, because the NSSIO destructors are not in the .fini_array, they're in the GNU C runtime's at-exit list.

    The protocol ensures that destructor functions will run with the reverse priority of the constructor functions, and NSSIO destructors will run in the reverse priority order of the NSSIO constructors. But the sequence of .fini_array calls is not somehow interlinked with that of the at-exit list. To ensure, for example, that libask's finis destructor will run after the NSSIO destructor of the std::string duo that is referenced in finis, we'd need at least a guarantee that the at-exit list will run after the .fini_array list, and the GNU C runtime does not give us that (or the opposite: in fact, it may interleave them).

    What we just saw with our last build of libask.so is that duo got destroyed before finis was run.

    See what the object files have got by way of (.init|.fini)_array[.NNNNN] sections:

    $  readelf --section-details --wide *v1.o | egrep \(File:\|\init_array\|fini_array\)
    File: ask_v1.o
      [19] .fini_array.32768
      [20] .rela.fini_array.32768
      [22] .init_array.32768
      [23] .rela.init_array.32768
      [35] .init_array.16384
      [36] .rela.init_array.16384
    File: garfunkel_v1.o
      [36] .init_array.16384
      [37] .rela.init_array.16384
    File: simon_v1.o
      [36] .init_array.16384
      [37] .rela.init_array.16384
    

    ask_v1.o has got both, and that makes it a suspect input to a C++ library, because it might harbour an SFOF (as it does).

    Disarming the SFOF

    There are two ways in terms of source code, but in principle - and under the hood - they come to the same thing:

    • Using free-standing initialisation and finalisation functions

    That's what we've been doing. What we also need to do is mimic the compiler's protocol for setting up NSSIO construction and destruction. We don't do the like of this:

    ...
    [[gnu::destructor(32768)]] void finis() {
        std::cout <<  __PRETTY_FUNCTION__ << std::endl;
        std::cout << "Do you mean \"" << duo << 
        "'s Greatest Hits\"?" << std::endl;
    }
    
    [[gnu::constructor(32768)]] void init() {
        std::cout <<  __PRETTY_FUNCTION__ << std::endl;
        duo = simons_name() + " and " + garfunkels_name();
    }
    ...
    

    which just gave our SFOF. Instead we do the like of this, in ask_v2.cpp:

    $ cat ask_v2.cpp 
    #include "simon.hpp"
    #include "garfunkel.hpp"
    #include <iostream>
    #include <string>
    #include <cstdlib>
    
    namespace {
        
    [[gnu::init_priority(16384)]] std::string duo;
    
    void finis() {
        std::cout <<  __PRETTY_FUNCTION__ << std::endl;
        std::cout << "Do you mean \"" << duo << 
        "'s Greatest Hits\"?" << std::endl;
    }
    
    [[gnu::constructor(32768)]] void init() {
        std::cout <<  __PRETTY_FUNCTION__ << std::endl;
        duo = simons_name() + " and " + garfunkels_name();
        atexit(finis);
    }
    
    } // anon. ns.
    
    void ask() {
        std::cout << "Have you got \"Greatest hits of " 
        << duo << "\"?" << std::endl;
    }
    

    No more destructor specification. After it has succeeded in doing everything else, init registers finis as an at-exit destructor in sequence with the at-exit destructor registrations of the NSSIOs defined before and after finis. It's registered after the destructor of duo, so it will be run before the destructor of duo. (It is OK to call the standard atexit rather than __cxa_atexit because the GNU C library's atexit delegates to __cxa_atexit).

    Relink and reload libask.so again:

    $ g++ -shared -fPIC -o libask.so ask_v2.cpp simon_v1.cpp garfunkel_v1.cpp
    $ ./ask
    void {anonymous}::init()
    Have you got "Greatest hits of Simon and Garfunkel"?
    void {anonymous}::finis()
    Do you mean "Simon and Garfunkel's Greatest Hits"?
    

    and now it's all good.

    • Not using free-standing initialisation and finalisation functions

    This is easy and you had the right idea yourself. Replace free functions init and finis with a single NSSIO, say init_finis, of class type say InitFinis, with an init_priority later than other NSSIOs, with a constructor that does the job of init and a destructor that does that of finis.
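
    A minimal single-file sketch of that arrangement, assuming g++ on GNU/Linux. The class body and the accessor current_duo are hypothetical, the priorities are the ones used in the v1 sources, and the destructor just clears the string where a real finis would release whatever init acquired:

```cpp
#include <string>

namespace {

// Ordinary library statics, all in an early priority band.
[[gnu::init_priority(16384)]] std::string const simon{"Simon"};
[[gnu::init_priority(16384)]] std::string const garfunkel{"Garfunkel"};
[[gnu::init_priority(16384)]] std::string duo;

// Hypothetical class replacing the free functions init and finis.
class InitFinis {
public:
    InitFinis()  { duo = simon + " and " + garfunkel; }  // the job of init
    ~InitFinis() { duo.clear(); }                        // the job of finis
    InitFinis(InitFinis const &) = delete;
    InitFinis & operator=(InitFinis const &) = delete;
};

// Later priority than every other static: built last and, because its
// destructor is registered last in the at-exit list, destroyed first.
[[gnu::init_priority(32768)]] InitFinis init_finis;

} // anon. ns.

// Hypothetical accessor used to observe the initialised state.
std::string const & current_duo() { return duo; }
```

    Because init_finis is itself an NSSIO, its destructor goes through the same at-exit registration as the others, so no destructor attribute or explicit atexit call is needed.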

    InitFinis would be an obvious candidate to work up into an encapsulating owner of all the library's statically initialised resources; more robust still, to have no statically initialised resources so that init_finis itself is the only NSSIO the library needs. Either way, you'd have no further use for...

    The Cheap Hack

    This is dirty because we'll invoke the linker twice and do some surgery on an intermediate binary with objcopy. We could get by with a single linkage, but only at the price of doing an objcopy operation on every object file. And of course it's dirty because it vitally depends on the unstandardised implementation details of g++ and the linker that we've noted.

    You'll need to execute this solution as a new build step after compilation and before final linkage until such time as you have refactored the code to conform with construct-at-first-use.

    Let's take a backward step to start with:

    $ sed -E 's/(.* (std::string duo))/\2/' ask_v2.cpp > ask_v3.cpp
    $ diff ask_v2.cpp ask_v3.cpp
    9c9
    < [[gnu::init_priority(16384)]] std::string duo;
    ---
    > std::string duo;
    

    ask_v3.cpp just strips the [[gnu::init_priority(16384)]] from std::string duo. Then:

    $ g++ -c -fPIC ask_v3.cpp simon_v0.cpp garfunkel_v0.cpp
    $ g++ -shared -o libask.so ask_v3.o simon_v0.o garfunkel_v0.o
    

    In this build of libask.so, using ask_v3.cpp with the other v0 sources, all the NSSIOs are reset to default priority while init is still [[gnu::constructor(32768)]] and still at-exit registers finis. The library is still SFOF-secure but no longer SIOF-secure, and:

    $ ./ask
    void {anonymous}::init()
    Segmentation fault (core dumped)
    

    Let's look again at (*.init|*.fini)_array[.NNNNN] sections of the object files:

    $ readelf --section-details --wide *v3.o *v0.o | egrep \(File:\|\init_array\|fini_array\)
    File: ask_v3.o
      [20] .init_array.32768
      [21] .rela.init_array.32768
      [33] .init_array
      [34] .rela.init_array
    File: garfunkel_v0.o
      [36] .init_array
      [37] .rela.init_array
    File: simon_v0.o
      [36] .init_array
      [37] .rela.init_array
      
    

    Now there are no .fini_array[.NNNNN] sections, because all my finalisations are sequenced SFOF-securely through the at-exit list. I just need to recover SIOF security.

    From the details I already laid out in over-explaining the source of the SFOF, you now know how the linker merges the .init_array[.32768] sections into the .init_array of libask.so. The .32768-suffixed sections will be sorted in before the unsuffixed ones, meaning that in runtime initialisation the [[gnu::constructor(32768)]] will run before the constructors of the NSSIOs, which have default priority - so the SIOF fires.

    The fact that we have init registered in the .init_array.32768 of ask_v3.o leaves all my NSSIO initialisations, and only them, registered in the other .init_arrays.

    Suppose we rename those .init_array sections to .init_array.16384? That would have the linktime effect of restoring [[gnu::init_priority(16384)]] to all the NSSIOs in the library, and restoring its SIOF security.

    What about those .rela.init_array[.32768] sections, which haven't come up? They're the relocation tables for the respective sections without the .rela prefix. For consistency, best likewise rename the .rela.init_array ones to .rela.init_array.16384? It turns out we don't have to bother, because objcopy will throw it in.

    libask.so links only 3 object files, but it might be any number. We can nevertheless stick to manipulating only one object file that we prepare like this:

    $ ld -r -o libask.o ask_v3.o simon_v0.o garfunkel_v0.o
    

    invoking the linker ld directly. This performs a partial (or incremental) linkage of the input object files into the relocatable object file libask.o. Undefined references are tolerated (because they're fine in an object file) and input sections sect[.suffix] are simply merged in the output section of the same name with no collapsing of input sect[.suffix] into output sect. As we can see:

    $ readelf --section-details --wide libask.o | egrep \(init_array\|fini_array\)
      [49] .init_array.32768
      [50] .rela.init_array.32768
      [51] .init_array
      [52] .rela.init_array
      
    

    Now let's do the renaming, outputting libask_hacked.o:

    $ objcopy --rename-section .init_array=.init_array.16384 libask.o libask_hacked.o
    

    And check it worked:

    $ readelf --section-details --wide libask_hacked.o | egrep \(init_array\|fini_array\)
      [49] .init_array.32768
      [50] .rela.init_array.32768
      [51] .init_array.16384
      [52] .rela.init_array.16384
      
    

    objcopy was smart enough to mirror the renaming of .init_array to .rela.init_array.

    Now if we relink libask.so with libask_hacked.o as the sole object file:

    $ g++ -shared -o libask.so libask_hacked.o
    

    it will initialise in the order:

    • NSSIO constructors
    • init()

    and finalise in the order:

    • finis()
    • NSSIO destructors

    as we want:

    $ ./ask
    void {anonymous}::init()
    Have you got "Greatest hits of Simon and Garfunkel"?
    void {anonymous}::finis()
    Do you mean "Simon and Garfunkel's Greatest Hits"?
    

    Once again good.

    With this hack in reserve, the only source code changes we needed to make from the first cut were in the source file defining init and finis, and they affected no NSSIOs even there. They were:

    • To make init be a constructor(NNNNN) for some 100 < NNNNN < 65535.
    • To remove any destructor[(NNNNN)] attribute from finis and instead make init register finis in the at-exit list when it has done everything else.

    1. I wouldn't strenuously contest the view that SIOF/SFOF should be reserved for fiascos arising from the relative order of static initialisations/finalisations, rather than ones arising more broadly from the relative order of things being executed that include static initialisations/finalisations. But for the present purpose I'll take the broader meanings.