Tags: x86, elf, portable-executable, relocation

Is there an ELF equivalent of PE base relocations?


I've been looking at some disassembly of some ELF binaries and I noticed this:

   0000000000401020 <_start>:
      401020:   31 ed                   xor    ebp,ebp
      401022:   49 89 d1                mov    r9,rdx
      401025:   5e                      pop    rsi
      401026:   48 89 e2                mov    rdx,rsp
      401029:   48 83 e4 f0             and    rsp,0xfffffffffffffff0
      40102d:   50                      push   rax
      40102e:   54                      push   rsp
      40102f:   49 c7 c0 30 13 40 00    mov    r8,0x401330
      401036:   48 c7 c1 d0 12 40 00    mov    rcx,0x4012d0
      40103d:   48 c7 c7 72 12 40 00    mov    rdi,0x401272
      401044:   ff 15 a6 2f 00 00       call   QWORD PTR [rip+0x2fa6]        # 403ff0 <__libc_start_main@GLIBC_2.2.5>
      40104a:   f4                      hlt    
      40104b:   0f 1f 44 00 00          nop    DWORD PTR [rax+rax*1+0x0]

When __libc_start_main gets called, we have those three immediate values passed via registers as parameters. Those are obviously function pointers that get called in __libc_start_main (including main). But these are virtual addresses, and my understanding is that the actual mapped address of the binary when it's loaded into memory and running will not necessarily be the same. So, these function pointers may not reflect their actual location in memory.

Being more acquainted with PE files, the IMAGE_DIRECTORY_ENTRY_BASERELOC data directory provides us with IMAGE_BASE_RELOCATION structures that help us adjust these constant values to reflect the new image base. But I don't see any equivalent of that for ELF files. Am I missing something here? How do these addresses get fixed when an ELF file is loaded?


Solution

  • and my understanding is that the actual mapped address of the binary when it's loaded into memory and running will not necessarily be the same.

    Nope, from those addresses we can see that this is a non-PIE ELF executable linked at ld's default base address. This is a position-dependent executable.

    The executable itself will always be loaded at a fixed virtual address, so static addresses can be put into registers using 32-bit immediates instead of RIP-relative LEA. ASLR for the executable itself is not allowed / possible.

    libc is an ELF "shared object" that can be ASLRed, hence the call to __libc_start_main via a pointer in the GOT. In the glibc source for this CRT start code, this probably looks like call *__libc_start_main@GOTPCREL(%rip) (AT&T syntax).
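
    As a quick check of the GOT-indirect call, the address objdump prints in its comment can be recomputed from the instruction bytes in the question. This is only a sketch: the helper name rip_relative_target and its disp_offset parameter are made up for illustration, and it assumes the displacement is the last field of the instruction.

```python
# Bytes of the call from the disassembly: ff 15 a6 2f 00 00
#   ff 15 = call QWORD PTR [rip+disp32], followed by a 4-byte displacement.
insn_addr = 0x401044
insn_bytes = bytes.fromhex("ff15a62f0000")


def rip_relative_target(addr: int, insn: bytes, disp_offset: int) -> int:
    """Return the absolute address referenced by a RIP-relative operand.

    disp_offset is the index of the 4-byte displacement inside the
    instruction (hypothetical parameter, chosen for this sketch).
    """
    disp = int.from_bytes(insn[disp_offset:disp_offset + 4], "little", signed=True)
    next_insn = addr + len(insn)  # RIP holds the address of the *next* instruction
    return next_insn + disp


got_slot = rip_relative_target(insn_addr, insn_bytes, disp_offset=2)
print(hex(got_slot))  # 0x403ff0, the GOT slot holding __libc_start_main's address
```

    The code only needs to know the distance from the call to the GOT slot; the dynamic linker fills in that slot, so no absolute libc address is baked into the instruction.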

    And BTW, we can tell this was hand-written asm, from the missed optimization of using 7-byte mov rdi, sign_extended_imm32 (same size as RIP-relative LEA) instead of 5-byte mov edi, imm32. The default non-PIE code-model in the x86-64 System V ABI puts all static code/data in the low 2GiB of virtual address space, so static addresses can be used with zero- or sign-extension to 64-bit.
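
    To illustrate why the low 2GiB matters, here is a small sketch that decodes the 7-byte mov from the question and compares sign- and zero-extension of a 32-bit immediate. The function names are invented for this example; only the instruction bytes come from the disassembly above.

```python
def sign_extend_imm32(imm32: int) -> int:
    """64-bit register value after `mov r64, imm32` (REX.W C7 /0 id)."""
    imm32 &= 0xFFFFFFFF
    if imm32 & 0x80000000:
        imm32 |= 0xFFFFFFFF00000000
    return imm32


def zero_extend_imm32(imm32: int) -> int:
    """64-bit register value after `mov r32, imm32` (upper half is zeroed)."""
    return imm32 & 0xFFFFFFFF


def decode_mov_r64_imm32(insn: bytes) -> int:
    # Layout of the 7-byte form: REX prefix, opcode C7, ModRM, 4-byte immediate.
    if len(insn) != 7 or insn[1] != 0xC7:
        raise ValueError("not the 7-byte mov r64, imm32 form")
    return sign_extend_imm32(int.from_bytes(insn[3:], "little"))


# 48 c7 c7 72 12 40 00  ->  mov rdi,0x401272
first_arg = decode_mov_r64_imm32(bytes.fromhex("48c7c772124000"))
print(hex(first_arg))  # 0x401272

# Below 2GiB both extensions agree, so either encoding works.
assert sign_extend_imm32(0x401272) == zero_extend_imm32(0x401272)
# At or above 2GiB they diverge, so the sign-extended form can no longer express the address.
assert sign_extend_imm32(0x80000000) != zero_extend_imm32(0x80000000)
```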


    ELF "executables" that can be loaded at a randomized base address are called PIE (Position Independent Executable). In terms of ELF details, they use the same ELF "type" as shared libraries, so they are in fact ELF shared objects that have an "entry point" and are marked as executable.

    Modern Linux distros have gcc defaulting to building PIEs. See 32-bit absolute addresses no longer allowed in x86-64 Linux? (relocatable ELF shared objects can be relocated anywhere in the address space, not restricted to the low 2GiB, so there's no relocation-type for runtime fixups of 32-bit absolute addresses.)

    There is a relocation type for 64-bit absolute addresses, so jump tables (of function/code pointers) are still possible, and so is the 10-byte mov rdi, imm64. But that's less efficient than a RIP-relative LEA, even before counting the cost of the ELF program loader or dynamic linker having to modify the program text to apply these relocations.
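
    For readers coming from PE, the runtime fixup of a 64-bit absolute address can be sketched roughly as follows. The answer above does not name the relocation type; R_X86_64_RELATIVE (value = load base + addend) and the Elf64_Rela layout used here are taken from the x86-64 psABI and ELF specification. The flat image buffer and the function name apply_relative_relocs are assumptions made up for this toy model, not how a real dynamic linker is structured.

```python
import struct

R_X86_64_RELATIVE = 8  # psABI: value = B + A (load base plus addend)

# Elf64_Rela entry: r_offset (u64), r_info (u64), r_addend (s64)
RELA = struct.Struct("<QQq")


def apply_relative_relocs(image: bytearray, rela: bytes, base: int) -> int:
    """Patch every R_X86_64_RELATIVE entry from `rela` into `image`.

    `image` is a hypothetical flat copy of the object indexed by link-time
    virtual address (which starts near 0 for a typical PIE).
    Returns the number of entries applied.
    """
    applied = 0
    for r_offset, r_info, r_addend in RELA.iter_unpack(rela):
        if r_info & 0xFFFFFFFF != R_X86_64_RELATIVE:
            continue  # other types need symbol lookup, out of scope here
        value = (base + r_addend) & 0xFFFFFFFFFFFFFFFF
        struct.pack_into("<Q", image, r_offset, value)
        applied += 1
    return applied


# Toy example: one pointer slot at offset 0x10 that should refer to offset 0x1272.
image = bytearray(0x20)
rela = RELA.pack(0x10, R_X86_64_RELATIVE, 0x1272)
apply_relative_relocs(image, rela, base=0x555555554000)
print(hex(struct.unpack_from("<Q", image, 0x10)[0]))  # 0x555555555272
```

    Unlike a PE base-relocation block, each entry here carries its own addend, so the loader writes base + addend rather than adding a delta to whatever is already stored at the target.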

    e.g. readelf -a /bin/ls

    ELF Header:
      Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
      Class:                             ELF64
      Data:                              2's complement, little endian
      Version:                           1 (current)
      OS/ABI:                            UNIX - System V
      ABI Version:                       0
      Type:                              DYN (Shared object file)
      Machine:                           Advanced Micro Devices X86-64
      Version:                           0x1
      Entry point address:               0x5ae0
    ...
    

    Note the Type field: DYN, the same as you get for an actual library, e.g. from readelf -a /lib/libc.so.6. And the entry point is a relative address: an offset from whatever base address the file gets mapped at.

    A non-PIE executable (e.g. statically linked, or built with -fno-pie -no-pie) looks like this:

    ELF Header:
      Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
      Class:                             ELF64
      Data:                              2's complement, little endian
      Version:                           1 (current)
      OS/ABI:                            UNIX - System V
      ABI Version:                       0
      Type:                              EXEC (Executable file)
      Machine:                           Advanced Micro Devices X86-64
      Version:                           0x1
      Entry point address:               0x401000
    

    Note the Type: EXEC and the absolute entry point (chosen at link-time by ld).
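
    The same two fields can be read without readelf. The following is a minimal sketch that assumes a 64-bit little-endian ELF file; the field offsets and the ET_EXEC / ET_DYN values come from the ELF specification, while the function name and the synthetic header are invented for this example.

```python
import struct

ET_EXEC, ET_DYN = 2, 3  # e_type values from the ELF specification


def elf_type_and_entry(header: bytes):
    """Return (type_name, e_entry) from the first 64 bytes of an ELF64 LE file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if header[4] != 2 or header[5] != 1:
        raise ValueError("this sketch only handles ELF64 little-endian")
    (e_type,) = struct.unpack_from("<H", header, 16)   # right after e_ident[16]
    (e_entry,) = struct.unpack_from("<Q", header, 24)  # after e_machine, e_version
    name = {ET_EXEC: "EXEC", ET_DYN: "DYN"}.get(e_type, hex(e_type))
    return name, e_entry


# Synthetic header mirroring the PIE readelf output above (not a real file).
fake = bytearray(64)
fake[:7] = b"\x7fELF\x02\x01\x01"
struct.pack_into("<HHIQ", fake, 16, ET_DYN, 62, 1, 0x5AE0)  # 62 = EM_X86_64

name, entry = elf_type_and_entry(bytes(fake))
print(f"{name}, entry {entry:#x}")  # DYN, entry 0x5ae0
```

    To try it on a real binary, pass the first 64 bytes of the file, e.g. elf_type_and_entry(open("/bin/ls", "rb").read(64)). Whether that reports DYN or EXEC depends on how your distro built it.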