Tags: c, glibc, weak-symbol

How does GNU Argp change its behavior if argp_program_bug_address exists or not?


GNU Argp (part of glibc, the GNU C Library) changes its behavior depending on whether various global variables are set to non-zero values. For example, if I define argp_program_bug_address with a non-zero value, it is used as part of the generated --help output.

Detecting the difference between zero and non-zero is straightforward. It's less obvious how Argp copes with the fact that I might not define argp_program_bug_address at all. If I do not define it, presumably glibc must provide a definition that sets it to zero. Indeed, there appears to be a definition in argp/argp-ba.c. But in that case, if I also define it, both my program and the library define it, which should result in link errors complaining about a multiple definition.

Weak symbols seem like a solution, but examining the glibc source, I cannot find any evidence that argp_program_bug_address or similar variables are defined weakly.

How does this work, so that I can define a value that gets used without causing a multiple-definition error, yet everything still works if I never define it?


Solution

  • When you link to glibc, you're linking with (e.g.) /usr/lib64/libc.so.

    This is a linker script:

    /* GNU ld script
       Use the shared library, but some functions are only in
       the static library, so try that secondarily.  */
    OUTPUT_FORMAT(elf64-x86-64)
    GROUP ( /lib64/libc.so.6 /usr/lib64/libc_nonshared.a  AS_NEEDED ( /lib64/ld-linux-x86-64.so.2 ) )
    

    argp/argp-ba.c contains only a single definition, with no initializer:

    const char *argp_program_bug_address;
    

    Because it has no initializer, this is a tentative definition. It goes into /usr/lib64/libc.a as a common symbol (nm shows type C, and readelf shows COM as the section index).

    So, if you do not define it, the symbol ends up in the .bss section using the definition from libc.a:argp-ba.o.

    If you do define it with a non-zero initializer, the linker uses your definition instead of the common symbol, and the symbol ends up in the .data section.