
How to find global static initializations

I just read this excellent article: and then I tried:

What it says about finding initializers does not work for me though. The .ctors section is not available, but I could find .init_array (see also Can't find .dtors and .ctors in binary). But how do I interpret the output? I mean, summing up the size of the pages can also be handled by the size command and its .bss column - or am I missing something?

Furthermore, nm does not report any *_GLOBAL__I_* symbols, only *_GLOBAL__N_* functions, and - more interesting - _GLOBAL__sub_I_somefile.cpp entries. The latter probably indicates files with global initialization. But can I somehow get a list of constructors that are being run? Ideally, a tool would give me a list of

Foo::Foo in file1.cpp:12
Bar::Bar in file2.cpp:45

(assuming I have debug symbols available). Is there such a tool? If not, how could one write it? Does the .init_array section contain pointers to code which could be translated via some DWARF magic to the above?


  • As you already observed, the implementation details of constructors/initialization functions are highly compiler (version) dependent. While I am not aware of a tool for this, what current GCC/clang versions do is simple enough to let a small script do the job: .init_array is just a list of entry points. objdump -s can be used to dump the list, and nm to look up the symbol names. Here's a Python script that does that. It should work for any binary generated by those compilers:

    #!/usr/bin/env python
    import os
    import sys
    # Load .init_array section
    objdump_output = os.popen("objdump -s '%s' -j .init_array" % (sys.argv[1].replace("'", "'\\''"),)).read()
    is_64bit = "x86-64" in objdump_output
    init_array = objdump_output[objdump_output.find("Contents of section .init_array:") + 33:]
    initializers = []
    for line in init_array.split("\n"):
        parts = line.split()
        if not parts:
            continue
        parts.pop(0)  # Remove offset
        parts.pop(-1) # Remove ascii representation
        if is_64bit:
            # 64bit pointers are 8 bytes long
            parts = [ "".join(parts[i:i+2]) for i in range(0, len(parts), 2) ]
        # Fix endianness (the dump shows raw little-endian bytes)
        parts = [ "".join(reversed([ x[i:i+2] for i in range(0, len(x), 2) ])) for x in parts ]
        initializers += parts
    # Load disassembly for c++ constructors
    dis_output = os.popen("objdump -d '%s' | c++filt" % (sys.argv[1].replace("'", "'\\''"), )).read()
    def find_associated_constructor(disassembly, symbol):
        # Find associated __static_initialization function
        loc = disassembly.find("<%s>" % symbol)
        if loc < 0:
            return False
        loc = disassembly.find(" <", loc)
        if loc < 0:
            return False
        symbol = disassembly[loc+2:disassembly.find("\n", loc)][:-1]
        if symbol[:23] != "__static_initialization":
            return False
        address = disassembly[disassembly.rfind(" ", 0, loc)+1:loc]
        loc = disassembly.find("%s <%s>" % (address, symbol))
        if loc < 0:
            return False
        # Find all callq's in that function
        end_of_function = disassembly.find("\n\n", loc)
        symbols = []
        while loc < end_of_function:
            loc = disassembly.find("callq", loc)
            if loc < 0 or loc > end_of_function:
                break
            loc = disassembly.find("<", loc)
            symbols.append(disassembly[loc+1:disassembly.find("\n", loc)][:-1])
        return symbols
    # Load symbol names, if available
    nm_output = os.popen("nm '%s'" % (sys.argv[1].replace("'", "'\\''"), )).read()
    nm_symbols = {}
    for line in nm_output.split("\n"):
        parts = line.split()
        if not parts:
            continue
        nm_symbols[parts[0]] = parts[-1]
    # Output a list of initializers
    for initializer in initializers:
        symbol = nm_symbols[initializer] if initializer in nm_symbols else "???"
        constructor = find_associated_constructor(dis_output, symbol)
        if constructor:
            for function in constructor:
                print("%s %s -> %s" % (initializer, symbol, function))
        else:
            print("%s %s" % (initializer, symbol))
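The section-parsing step can also be looked at in isolation. The following is only a sketch, assuming Python 3, a little-endian target, and the usual objdump -s layout (offset, up to four 8-digit hex groups, then an ASCII column). The name parse_init_array and the sample dump are made up for illustration. It matches hex groups explicitly instead of dropping the last field, since the ASCII column can itself contain spaces.

```python
import re

# One 4-byte group as printed by objdump -s (eight hex digits).
HEX_GROUP = re.compile(r"[0-9a-fA-F]{8}")

def parse_init_array(dump, pointer_size=8):
    """Return the addresses stored in an `objdump -s -j .init_array` dump.

    Hypothetical helper: assumes little-endian data and that every
    line looks like "offset group group group group  ascii".
    """
    marker = "Contents of section .init_array:"
    start = dump.find(marker)
    if start < 0:
        return []
    hex_digits = ""
    for line in dump[start + len(marker):].splitlines():
        if line.startswith("Contents of section"):
            break  # a different section follows
        # Skip the offset, then keep the leading run of hex groups.
        for token in line.split()[1:5]:
            if not HEX_GROUP.fullmatch(token):
                break  # reached the ASCII column
            hex_digits += token
    step = pointer_size * 2
    return [
        int.from_bytes(bytes.fromhex(hex_digits[i:i + step]), "little")
        for i in range(0, len(hex_digits) - step + 1, step)
    ]

# Invented sample; the second pointer contains a 0x20 byte, which
# shows up as a space in the ASCII column.
SAMPLE_DUMP = """
a.out:     file format elf64-x86-64

Contents of section .init_array:
 600e10 d0054000 00000000 10204000 00000000  ..@...... @.....
"""

print([hex(a) for a in parse_init_array(SAMPLE_DUMP)])
# prints ['0x4005d0', '0x402010']
```

For a 32-bit binary the same function would be called with pointer_size=4.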

    C++ static initializers are not called directly, but through two generated functions, _GLOBAL__sub_I_.. and __static_initialization... The script uses the disassembly of those functions to get the name of the actual constructor. You'll need the c++filt tool to demangle the names, or remove the call from the script to see the raw symbol names.
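The walk through the two generated functions can be sketched on its own with regular expressions. This is an illustrative variant, not the script's exact logic: the names called_symbols and constructors_for and the disassembly snippet are invented, and it assumes Python 3 and objdump -d output already piped through c++filt. It accepts both callq and call, because some objdump versions print the mnemonic without the suffix.

```python
import re

# A direct call line such as "callq  400750 <Foo::Foo()>".
CALL_LINE = re.compile(
    r"\bcallq?[ \t]+[0-9a-f]+[ \t]+<(.+)>[ \t]*$", re.MULTILINE)

def called_symbols(disassembly, function):
    """List the symbols called directly from `function`."""
    header = re.search(
        r"^[0-9a-f]+ <%s>:$" % re.escape(function),
        disassembly, re.MULTILINE)
    if header is None:
        return []
    # objdump separates functions with a blank line.
    end = disassembly.find("\n\n", header.end())
    if end < 0:
        end = len(disassembly)
    return CALL_LINE.findall(disassembly[header.end():end])

def constructors_for(disassembly, entry):
    """Follow entry -> __static_initialization... -> actual callees."""
    result = []
    for callee in called_symbols(disassembly, entry):
        if callee.startswith("__static_initialization"):
            result.extend(called_symbols(disassembly, callee))
        else:
            # Assumption: the helper may have been inlined, so the
            # entry point calls the constructor directly.
            result.append(callee)
    return result

# Invented snippet in the shape of demangled objdump -d output.
SAMPLE_DISASSEMBLY = """\
0000000000400700 <__static_initialization_and_destruction_0(int, int)>:
  400700:       55                      push   %rbp
  40071a:       e8 31 00 00 00          callq  400750 <Foo::Foo()>
  40072e:       e8 5d fe ff ff          call   400590 <__cxa_atexit@plt>
  400733:       c3                      retq

0000000000400740 <_GLOBAL__sub_I_main.cpp>:
  400745:       e8 b6 ff ff ff          callq  400700 <__static_initialization_and_destruction_0(int, int)>
  40074a:       c3                      retq
"""

print(constructors_for(SAMPLE_DISASSEMBLY, "_GLOBAL__sub_I_main.cpp"))
# prints ['Foo::Foo()', '__cxa_atexit@plt']
```

The __cxa_atexit entry appears because the destructor registration is a call like any other. Indirect calls (through a register) are not matched by this pattern.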

    Shared libraries can have their own initializer lists, which would not be displayed by this script. The situation is slightly more complicated there: For non-static initializers, the .init_array gets an all-zero entry that is overwritten with the final address of the initializer when loading the library. So this script would output an address with all zeros.
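For such libraries it may be clearer to report the all-zero entries separately rather than look them up in the symbol table. A tiny sketch, with the hypothetical helper name split_unrelocated, operating on a list of integer addresses read from .init_array:

```python
def split_unrelocated(addresses):
    """Separate usable entry points from all-zero placeholders.

    Hypothetical helper: a zero entry is assumed to be a slot that
    the dynamic loader fills in when the library is loaded.
    """
    resolved = [address for address in addresses if address != 0]
    placeholders = len(addresses) - len(resolved)
    return resolved, placeholders

resolved, placeholders = split_unrelocated([0x4005d0, 0, 0])
print([hex(a) for a in resolved], placeholders)
# prints ['0x4005d0'] 2
```

The real targets of the placeholder slots would have to come from the dynamic relocation table (for example as printed by objdump -R), which this sketch does not attempt.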