
Why does the dynamic linker *subtract* the virtual address to find where the executable or shared library is loaded in memory?


I've been reading the ld.so source code (dl_main in glibc's elf/rtld.c). Without knowing what phdr is passed in to dl_main, I'm confused about why main_map's load address is computed by subtracting the virtual address.

The code that I've traced through:

1124 static void
1125 dl_main (const ElfW(Phdr) *phdr,
1126          ElfW(Word) phnum,
1127          ElfW(Addr) *user_entry,
1128          ElfW(auxv_t) *auxv)
1129 {
1130   const ElfW(Phdr) *ph;
1131   enum mode mode;
1132   struct link_map *main_map;
1133   size_t file_size;
1134   char *file;
1135   bool has_interp = false;
1136   unsigned int i;
1137   bool prelinked = false;
1138   bool rtld_is_main = false;
1139   void *tcbp = NULL;
...    // Before this else, it thinks you're calling `ld.so.<version>` directly. This is not usually the case.
1366   else
1367     {
1368       /* Create a link_map for the executable itself.
1369          This will be what dlopen on "" returns.  */
1370       main_map = _dl_new_object ((char *) "", "", lt_executable, NULL,
1371                                  __RTLD_OPENEXEC, LM_ID_BASE);
1372       assert (main_map != NULL);
1373       main_map->l_phdr = phdr;
1374       main_map->l_phnum = phnum;
1375       main_map->l_entry = *user_entry;
1376 
1377       /* Even though the link map is not yet fully initialized we can add
1378          it to the map list since there are no possible users running yet.  */
1379       _dl_add_to_namespace_list (main_map, LM_ID_BASE);
1380       assert (main_map == GL(dl_ns)[LM_ID_BASE]._ns_loaded);
1399     }
...    // Loops through program headers loaded in sequence from the ELF header.
1409   for (ph = phdr; ph < &phdr[phnum]; ++ph)
1410     switch (ph->p_type)
1411       {
1412       case PT_PHDR:
1413         /* Find out the load address.  */
1414         main_map->l_addr = (ElfW(Addr)) phdr - ph->p_vaddr;
1415         break;

So why is the virtual address subtracted here to get the load address?


Solution

  • Line 1414 reads main_map->l_addr = (ElfW(Addr)) phdr - ph->p_vaddr;. The file link.h, which defines the link_map type used for main_map, describes link_map as a struct for a loaded shared object. Its l_addr field holds the difference between where the object is actually loaded in memory and where the file says it is loaded, which is the VirtAddr column when you run readelf:

    ❯ readelf -l main
    
    Elf file type is EXEC (Executable file)
    Entry point 0x401020
    There are 11 program headers, starting at offset 64
    
    Program Headers:
      Type           Offset             VirtAddr           PhysAddr
                     FileSiz            MemSiz              Flags  Align
      PHDR           0x0000000000000040 0x0000000000400040 0x0000000000400040
                     0x0000000000000268 0x0000000000000268  R      0x8
      INTERP         0x00000000000002a8 0x00000000004002a8 0x00000000004002a8
                     0x000000000000001c 0x000000000000001c  R      0x1
          [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
      LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                     0x00000000000004c0 0x00000000000004c0  R      0x1000
    ...
    

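    This means that l_addr is NOT the load address itself. It is the offset (often called the load bias) that you add to an address recorded in the ELF file to get the corresponding address in memory. In the expression on line 1414, phdr is the address of the program header table as it sits in memory, and the PT_PHDR entry's p_vaddr is where the file says that table should be, so subtracting one from the other yields that offset. For the EXEC binary above, the table is mapped at its recorded address 0x400040, so l_addr comes out as 0; for a position-independent executable or shared object the difference is generally nonzero.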
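
The same subtraction can be reproduced from user space. The following is a minimal sketch, assuming Linux with glibc, where getauxval(AT_PHDR) and getauxval(AT_PHNUM) return the in-memory address and count of the main program's program headers. The names load_bias and bias_finds_elf_header are hypothetical helpers, not part of glibc, and the sanity check assumes the file has a PT_LOAD segment that maps file offset 0 (typical for ld output, but not guaranteed).

```c
#include <elf.h>       /* PT_PHDR, PT_LOAD, ELFMAG, SELFMAG, AT_PHDR, AT_PHNUM */
#include <link.h>      /* ElfW() */
#include <stdint.h>
#include <string.h>
#include <sys/auxv.h>  /* getauxval() */

/* Hypothetical helper: compute the offset the way line 1414 does, as the
   runtime address of the program header table minus PT_PHDR's p_vaddr.  */
static uintptr_t load_bias(const ElfW(Phdr) *phdr, unsigned long phnum)
{
    for (unsigned long i = 0; i < phnum; ++i)
        if (phdr[i].p_type == PT_PHDR)
            return (uintptr_t) phdr - (uintptr_t) phdr[i].p_vaddr;
    /* Assumption: with no PT_PHDR entry, treat the file as mapped at the
       addresses it records, i.e. an offset of 0.  */
    return 0;
}

/* Hypothetical sanity check: adding the offset to the p_vaddr of the
   segment that maps file offset 0 should land on the ELF header.
   Returns 1 if the ELF magic is found there, 0 if not, and -1 if the
   program headers or such a segment are not available.  */
static int bias_finds_elf_header(void)
{
    const ElfW(Phdr) *phdr = (const ElfW(Phdr) *) getauxval(AT_PHDR);
    unsigned long phnum = getauxval(AT_PHNUM);
    if (phdr == NULL)
        return -1;

    uintptr_t bias = load_bias(phdr, phnum);
    for (unsigned long i = 0; i < phnum; ++i)
        if (phdr[i].p_type == PT_LOAD && phdr[i].p_offset == 0) {
            const unsigned char *ehdr =
                (const unsigned char *) (bias + (uintptr_t) phdr[i].p_vaddr);
            return memcmp(ehdr, ELFMAG, SELFMAG) == 0;
        }
    return -1;
}
```

Called on the program's own headers, load_bias should return 0 for a non-PIE build like the readelf example above, and the randomized base for a PIE build. In both cases the value is an offset to add to a VirtAddr, which is why the check finds the ELF header at bias + p_vaddr rather than at bias alone.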