ELF files and additional symbols


I'm reading about the ELF file format, and I've noticed that a small hello world test program written in C++ contains some additional initialization in the _start symbol:

0000000000400770 <_start>:
...
      40077f:       49 c7 c0 60 09 40 00    mov    $0x400960,%r8
      400786:       48 c7 c1 f0 08 40 00    mov    $0x4008f0,%rcx
      40078d:       48 c7 c7 5d 08 40 00    mov    $0x40085d,%rdi
...

400960 is __libc_csu_fini.

4008f0 is __libc_csu_init.

40085d is main.

Shouldn't _start just go straight to main? Why doesn't it? What would happen if I replaced the two instructions that load 0x400960 and 0x4008f0 with nop? Basically, what is the significance of requiring libc?


Solution

  • Looking at the glibc source code:

    /* These functions are passed to __libc_start_main by the startup code.
       These get statically linked into each program.  For dynamically linked
       programs, this module will come from libc_nonshared.a and differs from
       the libc.a module in that it doesn't call the preinit array.  */
    
    
    void
    __libc_csu_init (int argc, char **argv, char **envp)
    {
      /* For dynamically linked executables the preinit array is executed by
         the dynamic linker (before initializing any shared object).  */
    
    #ifndef LIBC_NONSHARED
      /* For static executables, preinit happens right before init.  */
      {
        const size_t size = __preinit_array_end - __preinit_array_start;
        size_t i;
        for (i = 0; i < size; i++)
          (*__preinit_array_start [i]) (argc, argv, envp);
      }
    #endif
    
    #ifndef NO_INITFINI
      _init ();
    #endif
    
      const size_t size = __init_array_end - __init_array_start;
      for (size_t i = 0; i < size; i++)
          (*__init_array_start [i]) (argc, argv, envp);
    }
    
    /* This function should not be used anymore.  We run the executable's
       destructor now just like any other.  We cannot remove the function,
       though.  */
    void
    __libc_csu_fini (void)
    {
    #ifndef LIBC_NONSHARED
      size_t i = __fini_array_end - __fini_array_start;
      while (i-- > 0)
        (*__fini_array_start [i]) ();
    
    # ifndef NO_INITFINI
      _fini ();
    # endif
    #endif
    }
    

    This allows library initialization code to run. Code that is linked into the program can tag functions with __attribute__((constructor)) in GCC, and this mechanism runs those functions before main, allowing libraries to initialize themselves before the program starts.