In ELF, why do the headers need to be in one segment?


I have made this simple ELF for learning purposes:

bits 64
org 0x08048000

elfHeader:
    db  0x7F, "ELF", 2, 1, 1, 0   ; e_ident
    db 0                            ; abi version
    times 7 db 0                    ; unused padding
    dw  2                         ; e_type
    dw  62                        ; e_machine
    dd  1                         ; e_version
    dq  _start                    ; e_entry
    dq  programHeader - $$        ; e_phoff
    dq  0                         ; e_shoff
    dd  0                         ; e_flags
    dw  elfHeaderSize             ; e_ehsize
    dw  programHeaderSize         ; e_phentsize
    dw  1                         ; e_phnum
    dw  0                         ; e_shentsize
    dw  0                         ; e_shnum
    dw  0                         ; e_shstrndx

elfHeaderSize  equ $ - elfHeader

programHeader:
    dd  1                         ; p_type
    dd  7                         ; p_flags
    dq  0                         ; p_offset
    dq  $$                        ; p_vaddr
    dq  $$                        ; p_paddr
    dq  fileSize                  ; p_filesz
    dq  fileSize                  ; p_memsz
    dq  0x1000                    ; p_align

programHeaderSize equ  $ - programHeader

_start:
   xor rdi, rdi
   xor eax,eax
   mov al,60
   syscall

fileSize      equ     $ - $$

To assemble that code I use NASM:

nasm -f bin exe.asm -o exe

If you look at programHeader, you will see that p_offset is 0 and p_filesz is fileSize. That means the segment contains the whole file, headers included. That's something I wasn't expecting (and I'm not the only one), but apparently Linux needs the headers to be in a segment of type PT_LOAD so that they get loaded.
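The layout behind that observation can be worked out by hand. The sketch below is only illustrative arithmetic: it assumes the standard ELF64 sizes (a 64-byte file header and 56-byte program header entries) and the usual encodings of the four instructions in _start. The variable names are made up for this example.

```python
# Sizes fixed by the ELF64 format: the file header is 64 bytes and each
# program header entry is 56 bytes.
EHDR_SIZE = 64
PHDR_SIZE = 56

# Assumed encodings of the code in _start:
# xor rdi,rdi (3 bytes), xor eax,eax (2), mov al,60 (2), syscall (2).
CODE_SIZE = 3 + 2 + 2 + 2

BASE = 0x08048000  # the "org" value from the listing

code_offset = EHDR_SIZE + PHDR_SIZE  # file offset of _start
file_size = code_offset + CODE_SIZE  # fileSize in the listing
entry = BASE + code_offset           # e_entry, the address of _start

print(hex(code_offset), hex(file_size), hex(entry))
```

With p_offset = 0 and p_filesz = fileSize, the single PT_LOAD segment therefore spans the 0x78 bytes of headers plus the code that follows them.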

This is the only resource I could find that mentions the fact that the headers are inside a segment: https://www.intezer.com/blog/research/executable-linkable-format-101-part1-sections-segments/

Something important to highlight about segments is that only PT_LOAD segments get loaded into memory. Therefore, every other segment is mapped within the memory range of one of the PT_LOAD segments.

In order to understand the relationship between Sections and Segments, we can picture segments as a tool to make the linux loader’s life easier, as they group sections by attributes into single segments in order to make the loading process of the executable more efficient, instead of loading each individual section into memory. The following diagram attempts to illustrate this concept:

[Image: diagram from the linked article, not reproduced here]

But I don't understand why Linux needs those headers to be loaded at run time. What are they used for? If they are needed for the process to run, couldn't Linux load them by itself?

EDIT:

It has been mentioned in the comments that the headers don't need to be loaded; they are sometimes loaded anyway to avoid having to add padding. I tried adding padding so that the code is 4 KB aligned, but it didn't work. Here's my attempt:

bits 64
org 0x08048000

elfHeader:
    db  0x7F, "ELF", 2, 1, 1, 0   ; e_ident
    db 0                            ; abi version
    times 7 db 0                    ; unused padding
    dw  2                         ; e_type
    dw  62                        ; e_machine
    dd  1                         ; e_version
    dq  _start                    ; e_entry
    dq  programHeader - $$        ; e_phoff
    dq  0                         ; e_shoff
    dd  0                         ; e_flags
    dw  elfHeaderSize             ; e_ehsize
    dw  programHeaderSize         ; e_phentsize
    dw  1                         ; e_phnum
    dw  0                         ; e_shentsize
    dw  0                         ; e_shnum
    dw  0                         ; e_shstrndx

elfHeaderSize  equ $ - elfHeader

programHeader:
    dd  1                         ; p_type
    dd  7                         ; p_flags
    dq  _start - $$               ; p_offset
    dq  $$                        ; p_vaddr
    dq  $$                        ; p_paddr
    dq  codeSize                  ; p_filesz
    dq  codeSize                  ; p_memsz
    dq  0x1000                    ; p_align

programHeaderSize equ  $ - programHeader

; padding until 4KB
paddingUntil4k equ 4*1024 - ($ - elfHeader)
times paddingUntil4k db 0


_start:
   xor rdi, rdi
   xor eax,eax
   mov al,60
   syscall

codeSize equ $ - _start
fileSize equ $ - $$

Solution

  • But I don't understand why Linux needs those headers to be loaded at run time.

    It doesn't.

    What are they used for? If they are needed for the process to run, couldn't Linux load them by itself?

    To answer these questions, you need to look at the Linux kernel source (the ELF loader lives in fs/binfmt_elf.c).

    There you can see that program headers do not need to be part of any PT_LOAD segment: the kernel reads them from the file on its own, before it maps anything.
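As a rough user-space illustration of that point (this is not the kernel's code), the sketch below rebuilds the bytes of the question's first listing and then finds the program headers using nothing but e_phoff, e_phentsize and e_phnum from the file header. The function names and the hex encoding of the _start instructions are assumptions made for this example.

```python
import struct

BASE = 0x08048000
# Assumed encoding of: xor rdi,rdi / xor eax,eax / mov al,60 / syscall
CODE = bytes.fromhex("4831ff31c0b03c0f05")

# ELF64 file header: e_ident (16 bytes) followed by the fixed fields.
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")
# ELF64 program header: p_type, p_flags, then six 64-bit fields.
PHDR = struct.Struct("<IIQQQQQQ")


def build_image():
    """Rebuild the bytes that the question's first listing assembles to."""
    ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
    size = EHDR.size + PHDR.size + len(CODE)
    entry = BASE + EHDR.size + PHDR.size
    ehdr = EHDR.pack(ident, 2, 62, 1, entry, EHDR.size, 0, 0,
                     EHDR.size, PHDR.size, 1, 0, 0, 0)
    phdr = PHDR.pack(1, 7, 0, BASE, BASE, size, size, 0x1000)
    return ehdr + phdr + CODE


def read_program_headers(image):
    """Locate the program headers using only fields of the file header."""
    fields = EHDR.unpack_from(image, 0)
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]
    return [PHDR.unpack_from(image, e_phoff + i * e_phentsize)
            for i in range(e_phnum)]


for phdr in read_program_headers(build_image()):
    p_type, p_flags, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_align = phdr
    print(f"type={p_type} offset={p_offset:#x} vaddr={p_vaddr:#x} filesz={p_filesz:#x}")
```

Nothing in read_program_headers depends on the headers being mapped into the process: it works purely on file offsets, which is why the loader does not need them inside a PT_LOAD segment.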

    Changing your original program like so:

    diff -u exe.asm.orig exe.asm
    --- exe.asm.orig        2021-02-07 18:54:34.449336515 -0800
    +++ exe.asm     2021-02-07 18:53:19.773532451 -0800
    @@ -24,9 +24,9 @@
     programHeader:
         dd  1                         ; p_type
         dd  7                         ; p_flags
    -    dq  0                         ; p_offset
    -    dq  $$                        ; p_vaddr
    -    dq  $$                        ; p_paddr
    +    dq  _start - $$               ; p_offset
    +    dq  _start                    ; p_vaddr
    +    dq  _start                    ; p_paddr
         dq  fileSize                  ; p_filesz
         dq  fileSize                  ; p_memsz
         dq  0x1000                    ; p_align
    

    produces a program which runs fine, but in which the program header is not in the PT_LOAD segment:

    eu-readelf --all exe
    ELF Header:
      Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
      Class:                             ELF64
      Data:                              2's complement, little endian
      Ident Version:                     1 (current)
      OS/ABI:                            UNIX - System V
      ABI Version:                       0
      Type:                              EXEC (Executable file)
      Machine:                           AMD x86-64
      Version:                           1 (current)
      Entry point address:               0x8048078
      Start of program headers:          64 (bytes into file)
      Start of section headers:          0 (bytes into file)
      Flags:
      Size of this header:               64 (bytes)
      Size of program header entries:    56 (bytes)
      Number of program headers entries: 1
      Size of section header entries:    0 (bytes)
      Number of section headers entries: 0 ([0] not available)
      Section header string table index: 0
    
    Section Headers:
    [Nr] Name                 Type         Addr             Off      Size     ES Flags Lk Inf Al
    
    Program Headers:
      Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
      LOAD           0x000078 0x0000000008048078 0x0000000008048078 0x000081 0x000081 RWE 0x1000
    

    I have tried adding padding

    You didn't do that correctly. Using your "with padding" source results in the following exe-padding:

    ...
      Entry point address:               0x8049000
    ...
    Program Headers:
      Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
      LOAD           0x001000 0x0000000008048000 0x0000000008048000 0x000009 0x000009 RWE 0x1000
    

    This binary is started by the kernel, and immediately jumps to the start address 0x8049000, which isn't mapped (since it's not covered by the PT_LOAD segment), resulting in immediate SIGSEGV.
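A small check makes the failure visible. This is a sketch under the assumption of a 4 KiB page size (the p_align value used in the listings); the helper names are hypothetical. It rounds the segment out to whole pages, as a mapping would be, and tests whether an entry address falls inside.

```python
PAGE_SIZE = 0x1000  # assumed page size; matches p_align in the listings


def mapped_range(p_vaddr, p_memsz, page_size=PAGE_SIZE):
    """Return the page-rounded [start, end) range covered by a PT_LOAD segment."""
    start = p_vaddr - (p_vaddr % page_size)
    end = p_vaddr + p_memsz
    end += -end % page_size  # round up to the next page boundary
    return start, end


def entry_is_mapped(e_entry, p_vaddr, p_memsz):
    """True if e_entry lies inside the pages the segment occupies."""
    start, end = mapped_range(p_vaddr, p_memsz)
    return start <= e_entry < end


# Values from the padded binary: the segment starts at 0x8048000 and is 9 bytes long.
print(entry_is_mapped(0x8049000, 0x8048000, 0x9))  # entry as assembled: False
print(entry_is_mapped(0x8048000, 0x8048000, 0x9))  # entry at the segment's first byte: True
```

The segment occupies only the page at 0x8048000, so an entry point of 0x8049000 lands on the first byte of the next, unmapped page.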

    To fix this, you need to adjust the entry address:

    diff -u exe-padding.asm.orig exe-padding.asm
    --- exe-padding.asm.orig        2021-02-07 18:57:31.800871195 -0800
    +++ exe-padding.asm     2021-02-07 19:34:27.303071700 -0800
    @@ -8,7 +8,7 @@
         dw  2                         ; e_type
         dw  62                        ; e_machine
         dd  1                         ; e_version
    -    dq  _start                    ; e_entry
    +    dq  _start - 0x1000           ; e_entry
         dq  programHeader - $$        ; e_phoff
         dq  0                         ; e_shoff
         dd  0                         ; e_flags
    

    This again produces a working executable. For the record:

    eu-readelf --all exe-padding
    ELF Header:
      Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
      Class:                             ELF64
      Data:                              2's complement, little endian
      Ident Version:                     1 (current)
      OS/ABI:                            UNIX - System V
      ABI Version:                       0
      Type:                              EXEC (Executable file)
      Machine:                           AMD x86-64
      Version:                           1 (current)
      Entry point address:               0x8048000
      Start of program headers:          64 (bytes into file)
      Start of section headers:          0 (bytes into file)
      Flags:                             
      Size of this header:               64 (bytes)
      Size of program header entries:    56 (bytes)
      Number of program headers entries: 1
      Size of section header entries:    0 (bytes)
      Number of section headers entries: 0 ([0] not available)
      Section header string table index: 0
    
    Section Headers:
    [Nr] Name                 Type         Addr             Off      Size     ES Flags Lk Inf Al
    
    Program Headers:
      Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
      LOAD           0x001000 0x0000000008048000 0x0000000008048000 0x000009 0x000009 RWE 0x1000
    

    P.S. You are linking your 64-bit program at 0x08048000, which is the traditional load address for i*86 (32-bit) executables. x86_64 binaries traditionally start at 0x400000.

    Update:

    About the first example, p_filesz is still fileSize, I think that should get outside of the boundaries of the file.

    That is correct: p_filesz and p_memsz should be reduced by the size of headers (0x78 here). Note that both of these will be rounded up to page size (after adding p_offset), so for this example there is no practical difference.
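The rounding can be checked with a few lines of arithmetic. This sketch assumes a 4 KiB page size, and mapped_length is a hypothetical helper rather than anything taken from the kernel.

```python
PAGE_SIZE = 0x1000  # assumed page size


def mapped_length(p_offset, p_filesz, page_size=PAGE_SIZE):
    """Bytes actually mapped: in-page offset plus size, rounded up to whole pages."""
    length = (p_offset % page_size) + p_filesz
    return length + (-length % page_size)


HEADERS = 0x78  # size of the ELF header plus one program header

as_written = mapped_length(HEADERS, 0x81)           # p_filesz left at fileSize
corrected = mapped_length(HEADERS, 0x81 - HEADERS)  # headers subtracted
print(hex(as_written), hex(corrected))
```

Both values come out as one page (0x1000), which is why the oversized p_filesz makes no practical difference for this file.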

    Update 2:

    pastebin.ubuntu.com/p/rgfVMrbcmJ

    This results in the following LOAD segment:

    Program Headers:
      Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
      LOAD           0x000078 0x0000000008048000 0x0000000008048000 0x000081 0x000081 RWE 0x1000
    

    This binary will not run (the kernel will reject it), because it asks the kernel to do the impossible: to map the bytes starting at file offset 0x78 to the start of a page.

    If an application attempted the equivalent mmap call, it would get an EINVAL error: mmap maps whole pages of the file onto whole pages of memory, so a segment can only be mapped when (offset % pagesize) == (addr % pagesize).
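The same rule can be applied to the three segments shown in this answer. The sketch below assumes a 4 KiB page size; can_be_mapped is a hypothetical name for the congruence test.

```python
PAGE_SIZE = 0x1000  # assumed page size


def can_be_mapped(p_offset, p_vaddr, page_size=PAGE_SIZE):
    """A segment is mappable only if offset and address agree modulo the page size."""
    return p_offset % page_size == p_vaddr % page_size


segments = {
    "headers excluded, vaddr = _start": (0x78, 0x8048078),
    "padded to 4 KiB": (0x1000, 0x8048000),
    "offset 0x78 at page start": (0x78, 0x8048000),
}

results = {name: can_be_mapped(off, addr) for name, (off, addr) in segments.items()}
for name, ok in results.items():
    print(f"{name}: {'ok' if ok else 'rejected'}")
```

Only the last combination fails the test: it is the one that pairs file offset 0x78 with a page-aligned address.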