c gcc linker symbol-table

Is C global static variable initialization done by the linker?


Let's say we have:

f1.c

#include <stdio.h>
static int x = 10;

void f1() {
  printf("f1.c : %d\n", x);
}

main.c

extern void f1();
int main(int argc, char **argv) {
  f1();
  return 0;
}

We compile and read the symbol tables of the two ELF files (the relocatable object file and the executable):

$> gcc -c *.c
$> readelf -s f1.o | grep x
      Num:    Value          Size Type    Bind   Vis      Ndx Name
        5: 0000000000000000     4 OBJECT  LOCAL  DEFAULT    3 x
$> gcc *.o
$> readelf -s a.out | grep x
      Num:    Value          Size Type    Bind   Vis      Ndx Name
       38: 0000000000601038     4 OBJECT  LOCAL  DEFAULT   25 x

Reading the relocatable object file f1.o, I can see that the Value (that is, the address) of the global static variable x is 0000000000000000.
I take this to mean that x has not been initialized yet, since f1.o is still a relocatable ELF object file, and that the linker will take care of it.

So my question is: if the linker is the one that sets x to 10 at its final address after linking (0000000000601038), how does it do so? Where does the linker get the information that the value is 10, and who provides it (f1.o?)?


Solution

  • The value 0000000000000000 in the object file f1.o is not a final address. It is the offset of the static variable within its section (section index 3 in your readelf output), and f1.o also contains relocation entries that refer to it. In particular, the code that fetches x as the argument to printf carries a relocation on the load machine instruction.

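    In that object file, section 3 is most likely the .data section. It should start with a 4-byte word (at the offset 0 you observed in f1.o) containing 10. So the initial value is put there by the compiler, not computed by the linker: the linker copies the section's contents into the executable and assigns x its final address.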

    Read much more about linkers (I recommend Levine's Linkers and Loaders book). Linking (to produce the ELF executable) consists largely of processing relocation directives. Also read more about the ELF format, starting with elf(5) (after having read the ELF Wikipedia page). Also study the ABI specification (for Linux x86-64, the System V x86-64 psABI), which details the possible relocation types.

    You may want to compile your f1.c with gcc -Wall -S -fverbose-asm -O1 f1.c and then look at the emitted assembler file f1.s.

    You may also want to inspect the object file f1.o and the ELF executable a.out with various tools like readelf(1) and objdump(1). Both accept numerous options (notably the -r option to objdump to show relocation directives).

    Dynamic linking (of the C standard library libc.*.so) introduces some additional complexity in the ELF executable. See also ld-linux(8) (the dynamic linker, which does part of the linking job when the program starts) and vdso(7). You may also want to read Drepper's How To Write Shared Libraries paper.

    The freely available textbook Operating Systems: Three Easy Pieces could also be worthwhile to read (it explains what a process is and how its execution proceeds).