Search code examples
c++gccldstatic-linking

gcc ld: method to determine link order of static libraries


My executables are linked with many static libraries, typically between 50 and 100 archives on Linux. Occasionally there are dependency cycles in these archives. The order that these libraries appear on the link command line is significant, see here. Attempting to manually order this many libraries is time-consuming at minimum, especially when there are cycles present.

Question: is there a utility or technique that will analyze a code base and produce a correct link command line ordering?


Solution

  • You want a topological sort.

    The tsort program will do that, but you'll need to do more work to use it [be prepared to write a perl/python script]. Also, there is another way as well. And, I will get to the "howto" below as I've done this sort of thing before.


    The short answer: Use --start-group liblist --end-group and be done with it.

    For a few reasons:

    An ld group is smart. It doesn't just loop on the files. It makes an initial pass through the group, but remembers the symbols. So, on subsequent passes it uses the cached symbol table information, so it's very fast.

    For complex interactions, you may not be able to get rid of all the cycles with a toposort, so you'll still need a group even if liblist has been topo sorted.

    Just how much time are we talking about? And, how much time do you think will be saved? How will you measure things to prove you really need this.


    Go for the gold

    Instead of using ld, consider using ld.gold. It has been rewritten from scratch to not use libbfd [which is slow] and operates on ELF files directly. The primary motivation for creating it was simplicity and speed.


    How to topologically sort a library list

    If we do info coreutils, the tsort section will give an example of how to toposort a symbol table.

    But, before we can get to that, we'll need to get the symbols. For a .a file, nm can provide the list: nm -go <liblist>.

    The output will look like:

    libbfd.a:
    libbfd.a:archive.o:0000000000000790 T _bfd_add_bfd_to_archive_cache
    libbfd.a:archive.o:                 U bfd_alloc
    libbfd.a:archive.o:0000000000000c20 T _bfd_append_relative_path
    libbfd.a:archive.o:                 U bfd_assert
    libbfd.a:archive.o:                 U bfd_bread
    libbfd.a:archive.o:00000000000021b0 T _bfd_bsd44_write_ar_hdr
    libbfd.a:archive.o:                 U strcpy
    libbfd.a:archive.o:                 U strlen
    libbfd.a:archive.o:                 U strncmp
    libbfd.a:archive.o:                 U strncpy
    libbfd.a:archive.o:                 U strtol
    libbfd.a:archive.o:                 U xstrdup
    libbfd.a:bfd.o:                 U __asprintf_chk
    libbfd.a:bfd.o:00000000000002b0 T _bfd_abort
    libbfd.a:bfd.o:0000000000000e40 T bfd_alt_mach_code
    libbfd.a:bfd.o:                 U bfd_arch_bits_per_address
    libbfd.a:bfd.o:0000000000000260 T bfd_assert
    libbfd.a:bfd.o:0000000000000000 D _bfd_assert_handler
    libbfd.a:bfd.o:0000000000000450 T bfd_canonicalize_reloc
    libbfd.a:bfd.o:                 U bfd_coff_get_comdat_section
    libbfd.a:bfd.o:0000000000000510 T _bfd_default_error_handler
    libbfd.a:bfd.o:0000000000000fd0 T bfd_demangle
    libbfd.a:bfd.o:                 U memcpy
    libbfd.a:bfd.o:                 U strchr
    libbfd.a:bfd.o:                 U strlen
    libbfd.a:opncls.o:0000000000000a50 T bfd_openr
    libbfd.a:opncls.o:0000000000001100 T bfd_openr_iovec
    libbfd.a:opncls.o:0000000000000b10 T bfd_openstreamr
    libbfd.a:opncls.o:0000000000000bb0 T bfd_openw
    libbfd.a:opncls.o:0000000000001240 T bfd_release
    libbfd.a:opncls.o:                 U bfd_set_section_contents
    libbfd.a:opncls.o:                 U bfd_set_section_size
    libbfd.a:opncls.o:0000000000000000 B bfd_use_reserved_id
    libbfd.a:opncls.o:00000000000010d0 T bfd_zalloc
    libbfd.a:opncls.o:00000000000011d0 T bfd_zalloc2
    
    libglib-2.0.a:
    libglib-2.0.a:libglib_2_0_la-gallocator.o:0000000000000100 T g_allocator_free
    libglib-2.0.a:libglib_2_0_la-gallocator.o:00000000000000f0 T g_allocator_new
    libglib-2.0.a:libglib_2_0_la-gallocator.o:0000000000000150 T g_blow_chunks
    libglib-2.0.a:libglib_2_0_la-gallocator.o:0000000000000160 T g_list_push_allocator
    libglib-2.0.a:libglib_2_0_la-gallocator.o:0000000000000060 T g_mem_chunk_alloc
    libglib-2.0.a:libglib_2_0_la-gallocator.o:0000000000000090 T g_mem_chunk_alloc0
    libglib-2.0.a:libglib_2_0_la-gallocator.o:0000000000000110 T g_mem_chunk_clean
    libglib-2.0.a:libglib_2_0_la-gallocator.o:0000000000000120 T g_mem_chunk_reset
    libglib-2.0.a:libglib_2_0_la-gallocator.o:00000000000001b0 T g_node_pop_allocator
    libglib-2.0.a:libglib_2_0_la-gallocator.o:00000000000001a0 T g_node_push_allocator
    libglib-2.0.a:libglib_2_0_la-gallocator.o:                 U g_return_if_fail_warning
    libglib-2.0.a:libglib_2_0_la-gallocator.o:                 U g_slice_alloc
    libglib-2.0.a:libglib_2_0_la-gallocator.o:                 U g_slice_alloc0
    libglib-2.0.a:libglib_2_0_la-gallocator.o:                 U g_slice_free1
    libglib-2.0.a:libglib_2_0_la-gallocator.o:0000000000000190 T g_slist_pop_allocator
    libglib-2.0.a:libglib_2_0_la-gslice.o:                 U g_private_get
    libglib-2.0.a:libglib_2_0_la-gslice.o:                 U g_private_set
    libglib-2.0.a:libglib_2_0_la-gslice.o:                 U g_return_if_fail_warning
    libglib-2.0.a:libglib_2_0_la-gslice.o:00000000000010d0 T g_slice_alloc
    libglib-2.0.a:libglib_2_0_la-gslice.o:0000000000001770 T g_slice_alloc0
    libglib-2.0.a:libglib_2_0_la-gslice.o:00000000000017a0 T g_slice_copy
    libglib-2.0.a:libglib_2_0_la-gslice.o:00000000000017e0 T g_slice_free1
    libglib-2.0.a:libglib_2_0_la-gslice.o:0000000000001ae0 T g_slice_free_chain_with_offset
    

    So, the syntax will be:

    <libname.a>:<objname.o>:<address> [TDB] <symbol>
    <libname.a>:<objname.o>:          U     <symbol>
    

    and we'll need to extract libname.a, symbol type (e.g. T, D, B, U), and the symbol.

    We create a list of files. In each file struct, we remember all symbols and their types. Any type that is not U [undefined symbol] will define the symbol.

    Note that as we build the symbol table, a library may have multiple U's [in various .o's] that refer to a symbol defined by another .o within it. So, we only record the symbol once and if we see a non-U type, we "promote" it (e.g. if we saw U foo and later saw T foo we change the type of foo to T [likewise for D and B].

    Now we traverse the file list (e.g. curfile). For each symbol in the file's symbol table, if it's of type U [undefined], we scan all files looking for a non-U symbol definition. If we find one (in symfile (e.g.)), we can output a dependency line for tsort: <curfile> <symfile>. We repeat this for all files and symbols.

    Note that this is bit wasteful because we could output many file dependency lines that are identical because the above will generate a line for each symbol. So, we should keep track of the lines output and only output a dependency line for unique file pairs. Also, note, it is possible to have both foo bar and bar foo. That is, actually, a cycle. While we just want one copy of foo bar and/or bar foo, they should not exclude one another.

    Okay, so now feed the output of the above to tsort and it will give us the topologically sorted version of liblist that we want.

    As should be obvious, the script parsing can take some time, so the tsort output should be cached in a file, and rebuilt in a makefile, based upon a dependency list of liblist


    Convert some .a files to .o files

    If a given library uses all [or most] of the its .o files, instead of doing ar rv libname.a ..., consider doing ld -r libname.o ....

    This is similar in approach to creating a shared library .so file, but the "big" .o can still be statically linked.

    Now, you have a single .o that will link faster than the .a because the intra-library links have already been resolved. Also, it will help with dependency cycles a bit.

    A slight extension to the topo script could tell you which libraries are good candidates for this.

    Even if the normal build makefiles can't be changed, the "final" top level could take a .a, either extract it into .o's, or use an ld force load option with -r to get the "big" .o