According to the ld source code here in dl_main
. Without the context of what phdr
is passed in to dl_main
, I'm a bit confused on why main_map's load address is deduced by subtracting the virtual address.
The code that I've traced through:
1124 static void
1125 dl_main (const ElfW(Phdr) *phdr,
1126 ElfW(Word) phnum,
1127 ElfW(Addr) *user_entry,
1128 ElfW(auxv_t) *auxv)
1129 {
1130 const ElfW(Phdr) *ph;
1131 enum mode mode;
1132 struct link_map *main_map;
1133 size_t file_size;
1134 char *file;
1135 bool has_interp = false;
1136 unsigned int i;
1137 bool prelinked = false;
1138 bool rtld_is_main = false;
1139 void *tcbp = NULL;
... // Before this else, it thinks you're calling `ld.so.<version>` directly. This is not usually the case.
1366 else
1367 {
1368 /* Create a link_map for the executable itself.
1369 This will be what dlopen on "" returns. */
1370 main_map = _dl_new_object ((char *) "", "", lt_executable, NULL,
1371 __RTLD_OPENEXEC, LM_ID_BASE);
1372 assert (main_map != NULL);
1373 main_map->l_phdr = phdr;
1374 main_map->l_phnum = phnum;
1375 main_map->l_entry = *user_entry;
1376
1377 /* Even though the link map is not yet fully initialized we can add
1378 it to the map list since there are no possible users running yet. */
1379 _dl_add_to_namespace_list (main_map, LM_ID_BASE);
1380 assert (main_map == GL(dl_ns)[LM_ID_BASE]._ns_loaded);
1399 }
... // Loops through program headers loaded in sequence from the ELF header.
1409 for (ph = phdr; ph < &phdr[phnum]; ++ph)
1410 switch (ph->p_type)
1411 {
1412 case PT_PHDR:
1413 /* Find out the load address. */
1414 main_map->l_addr = (ElfW(Addr)) phdr - ph->p_vaddr;
1415 break;
So why is the load address subtracted here?
In particular, on line 1414, we see main_map->l_addr = (ElfW(Addr)) phdr - ph->p_vaddr;
. From the file link.h
which defines the type link_map
for main_map
, I see that link_map
is a struct that describes a loaded shared object. The l_addr
field is to describe the difference between where the .so is loaded into memory vs. where it says it is loaded in the VirtAddr
field when you run readelf
:
❯ readelf -l main
Elf file type is EXEC (Executable file)
Entry point 0x401020
There are 11 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
PHDR 0x0000000000000040 0x0000000000400040 0x0000000000400040
0x0000000000000268 0x0000000000000268 R 0x8
INTERP 0x00000000000002a8 0x00000000004002a8 0x00000000004002a8
0x000000000000001c 0x000000000000001c R 0x1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x00000000000004c0 0x00000000000004c0 R 0x1000
...
This means that l_addr
is NOT the load address. It's actually the offset for which you need to add from the current memory to access the shared object contents.