linker, operating-system, compiler-construction

Does the Linker OR the Loader make the necessary relocations of a program?


I want to create my own linker and loader. I know that in the linking stage the linker will take into consideration the relocation data in the ELF header for all the object files.

The linker will then create an executable file with all the addresses resolved and store it on the hard drive.

When the time comes, the loader will have to load that executable into main memory, but the memory already contains running programs, so there will be conflicts.

Question 1: Must the loader relocate the addresses all over again?
Question 2: If yes, does that mean the loader must scan all the text sections of the executable and change the addresses in all CPU instructions?*

*That would mean the loader has a copy of the ISA in memory and must scan instruction by instruction. It's like an execution before the execution.


Solution

  • There is no relocation data in the ELF header. Linkable ELF object files store relocation data in companion sections named .rela.text, .rela.data, etc. (.rel.text, .rel.data on targets that use addend-less records, such as 32-bit x86). The static linker on Linux chooses the starting address where the executable image will be loaded (traditionally 0x08048000 on 32-bit x86 and 0x400000 on x86-64 for non-PIE executables) and then uses the relocations to update instructions and data in the code and data sections. Once .rela.text and .rela.data have been processed, these relocation sections are no longer needed and may be stripped from the final ELF executable.
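As a rough illustration of what "using relocations to update data" means, here is a minimal sketch of a linker pass over Elf64_Rela records. It assumes a little-endian x86-64 object and handles only the absolute R_X86_64_64 type (value = symbol address + addend). The function name apply_rela and the symbol_addrs mapping are hypothetical; a real linker handles many more relocation types and resolves symbols through the symbol table.

```python
import struct

# Elf64_Rela record: r_offset (u64), r_info (u64), r_addend (i64), little-endian.
RELA = struct.Struct("<QQq")
R_X86_64_64 = 1  # absolute 64-bit relocation: value = S + A


def apply_rela(section: bytearray, rela: bytes, symbol_addrs: dict) -> None:
    """Patch `section` in place using the relocation records in `rela`.

    `symbol_addrs` (hypothetical) maps a symbol-table index to the final
    address the linker has already chosen for that symbol.
    """
    for r_offset, r_info, r_addend in RELA.iter_unpack(rela):
        sym_index = r_info >> 32          # ELF64_R_SYM
        rel_type = r_info & 0xFFFFFFFF    # ELF64_R_TYPE
        if rel_type != R_X86_64_64:
            raise NotImplementedError(f"relocation type {rel_type}")
        value = (symbol_addrs[sym_index] + r_addend) & 0xFFFFFFFFFFFFFFFF
        # Only the 8 bytes at r_offset change; the rest of the section is untouched.
        struct.pack_into("<Q", section, r_offset, value)
```

Note that the linker never decodes instructions here: each record tells it exactly which bytes to overwrite.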

    When the time comes to load the linked executable, the loader creates a new process in protected mode. The process gets an entire virtual address space of its own, initially unoccupied. Other programs may be loaded on the same computer, but each runs happily in its own private address space.

    The scenario you're afraid of sometimes happens on Windows, when different dynamic libraries were linked to start at conflicting virtual addresses. That is why the Portable Executable format (PE, used by EXE and DLL files) keeps relocation records in a dedicated .reloc section, and yes, the loader must then relocate all addresses listed in that section.

    The situation is similar on DOS in real mode, where all processes share a single 1 MiB address space. MZ executables are linked to address 0, all addresses that require relocation are recorded in the relocation pointer table following the MZ EXE header, and the loader is responsible for updating the segment addresses listed in this table.
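The DOS case is simple enough to sketch in a few lines. The following is an illustrative model, not a real DOS loader: it reads the relocation count (header offset 0x06), the header size in paragraphs (0x08) and the relocation table offset (0x18) from the MZ header, then adds the load segment to every 16-bit word the table points at. The function name relocate_mz is hypothetical, and the sketch ignores the image-size fields, the initial CS:IP and SS:SP, and the PSP.

```python
import struct


def relocate_mz(exe: bytes, load_segment: int) -> bytearray:
    """Return the load module of an MZ file with segment fixups applied."""
    if exe[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    reloc_count, header_paragraphs = struct.unpack_from("<HH", exe, 0x06)
    (reloc_table_offset,) = struct.unpack_from("<H", exe, 0x18)
    # The load module starts right after the header (size given in 16-byte paragraphs).
    image = bytearray(exe[header_paragraphs * 16:])
    for i in range(reloc_count):
        # Each table entry is an offset:segment pair relative to the load module.
        offset, segment = struct.unpack_from("<HH", exe, reloc_table_offset + 4 * i)
        where = segment * 16 + offset
        (old,) = struct.unpack_from("<H", image, where)
        struct.pack_into("<H", image, where, (old + load_segment) & 0xFFFF)
    return image
```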

    Answer 1: Relocation is necessary only if the executable image is loaded at a different address than the one it was linked for, and if it was not linked as a Position-Independent Executable.
    Answer 2: Relocation does not concern the addresses of all CPU instructions, only those fields in an instruction body (displacement or immediate address) that refer to an address. Such places must be explicitly listed in relocation records. If the relocation information was stripped from the file and the image cannot be loaded at its linked address, your loader should refuse to execute it.
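Both answers can be put together in a small sketch of the loader's decision. It assumes a made-up, simplified image format in which the relocation records are just a list of offsets of 32-bit absolute-address fields; the names load_image and fixup_offsets are hypothetical and do not correspond to any real loader API.

```python
import struct


def load_image(image: bytes, linked_base: int, load_base: int, fixup_offsets=None) -> bytearray:
    """Return a copy of `image` prepared to run at `load_base`.

    `fixup_offsets` lists the offsets of 32-bit absolute-address fields,
    or is None when the relocation information was stripped.
    """
    loaded = bytearray(image)
    delta = load_base - linked_base
    if delta == 0:
        return loaded  # loaded where it was linked: nothing to patch
    if fixup_offsets is None:
        raise RuntimeError("image must be rebased but has no relocation records")
    for off in fixup_offsets:
        # Patch only the listed address fields; opcodes are never inspected.
        (addr,) = struct.unpack_from("<I", loaded, off)
        struct.pack_into("<I", loaded, off, (addr + delta) & 0xFFFFFFFF)
    return loaded
```

For example, an image consisting of `mov eax, 0x08048010; ret` linked for 0x08048000 has one fixup at offset 1 (the immediate after the 0xB8 opcode); the loader adjusts those four bytes and nothing else.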

    A good source of information: the Linkers series on Ian Lance Taylor's blog.