Build library archive with undefined references


A colleague of mine told me yesterday that building libfoo.a doesn't require all its functions to be defined, as long as they are defined by the time you build an executable that links to it and supplies the missing definitions.

He said that archives are only a collection of object files with an index, and since object files can be built with undefined references, so can archives.

Is this true? If so, does this imply that reference resolution is performed ONLY during the linking stage (i.e. never at compilation or archiving)?

Thanks a lot. The compiler is gcc, by the way, and the language is C/C++.


Solution

  • Yes, all that is perfectly true. You seem to know that libfoo.a is an ar archive. ar is a general-purpose archiver (GNU ar is the implementation shipped with GNU binutils). It is quite as happy to archive the contents of your Documents, Pictures and/or Music folders as a collection of object files.

    External symbol resolution is linkage: it is the core business of linkage, and linkage is done only by the linker. If ar were supposed to resolve the external symbol references of object files in an archive, then ar, like the linker, would require command options to specify the external libraries in which symbol definitions are to be searched for, and the directories in which those libraries are to be searched for. It has no such options.

    An ar archive may be used as a linker input file. In this case the linker will search the archive for any object files that provide definitions for unresolved symbol references that have accrued from object files already consumed. It will not care at all what other kinds of files are in the archive alongside the object files. If it finds any object files that define unresolved references, it extracts them from the archive and adds them to the linkage, exactly as if they had been individually specified on the command line and the archive not mentioned at all. So the only role of an archive in linkage is as a bag of object files from which the linker can pick the ones it needs to carry on.

    If we know the right bag to offer the linker, we're spared the difficulty of knowing exactly which object files within it the linkage will need. That's the usefulness of static libraries. In principle, any archive format might have been adopted (.tar, .gz...), but ar was first in the field, is not burdened with unwanted functionality (directory serialization, compression...), and was history's choice. Microsoft's LIB format, incidentally, is a variant of the ar format.

    For this role in the service of the linker, GNU ar has specialized a little for the presence of object files. The s option, which is the default and can be overridden with S, adds a fake "file" to the archive with an empty filename and data that the linker can read as a lookup table mapping the global symbols defined by the archive's object files to the names and positions of those object files. Formerly (and in non-GNU variants of ar) this kludge was applied by running a separate program, ranlib, on an archive to make it accessible to the linker. The injection of a ranlib table is what enables the linker to pick the object files it needs out of the archive. Any undefined references that are brought in with these object files are for the linker to resolve as usual, from object files or libraries subsequently consumed.

    The wording of your question suggests you may be under the impression that "archiving" (e.g. creating libfoo.a) is one of the processes that can be invoked, like compilation and linkage, through the GCC frontends (gcc, g++, gfortran, etc.). This isn't so. Those frontends invoke only (one or more of) a preprocessor, a compiler, an assembler and the linker. Archives are an auxiliary convenience for delivering object files to the linker and are created straightforwardly with ar:

        ar cr libfoo.a file.o...
    

    When this is done, the undefined references within libfoo.a are exactly the undefined references within file.o...