Tags: c, linux, elf, embedded-resource, embedding

Why does Linux fail to map this PT_LOAD ELF segment?


I'm trying to embed arbitrary data into an ELF executable and have Linux map it into memory automatically at load time. I recently asked another question about this, which culminated in support for this use case being added to the mold linker.

I have written a tool that appends arbitrary data at the end of the executable and patches in a PT_LOAD ELF program header that points to the appended data. This is the patching logic:

appended_data_file_offset = /* ... seek(elf file, SEEK_END) ... */;
appended_data_size = /* ... stat(data file) ... */;

phdr->p_type = PT_LOAD;

phdr->p_filesz = phdr->p_memsz = appended_data_size;

size_t base = phdr->p_vaddr - phdr->p_offset; // calculate program's base load address
phdr->p_vaddr = phdr->p_paddr = base + appended_data_file_offset;
phdr->p_offset = appended_data_file_offset;

phdr->p_align = 1;
phdr->p_flags = PF_R;

Running my patcher results in an ELF file with this data appended to it at offset 0xAD78:

0000ad70: 00 00 00 00 00 00 00 00 74 65 73 74 20 64 61 74  ........test dat
0000ad80: 61 0a                                            a.

And this PT_LOAD segment added:

 Program Headers:
   Type           Offset             VirtAddr           PhysAddr
                  FileSiz            MemSiz              Flags  Align
   LOAD           0x000000000000ad78 0x000000000020ad78 0x000000000020ad78
                  0x000000000000000a 0x000000000000000a  R      0x1

The new segment and the 10-byte block at the end are the only changes made to an otherwise working ELF executable. I confirmed this with a binary comparison.

At runtime, the program is supposed to reach that data. It does so via the auxiliary vector:

Elf64_Phdr *header = (Elf64_Phdr *) getauxval(AT_PHDR);
size_t count = getauxval(AT_PHNUM);
size_t size = getauxval(AT_PHENT);
assert(size == sizeof(Elf64_Phdr));

for (size_t i = 0; i < count; ++header, ++i) {
    if (header->p_type != PT_LOAD) { continue; }
    // p_vaddr is an integer address, so it must be cast to a pointer for memcmp
    if (0 == memcmp((const void *) header->p_vaddr, "test", sizeof("test") - 1)) {
        // found it
    }
}

I used libc functions for clarity. My actual program is a static EXEC ELF file written in freestanding C. It does not link to libc and uses Linux system calls directly.

After patching the executable in this manner, I intended for this to happen:

  1. Linux automatically loads into memory the data appended to the executable.
    • The 10 byte block located at offset 0xAD78 in the file.
  2. Program finds the program header table via AT_PHDR value in the auxiliary vector.
  3. Program scans PT_LOAD segments until it finds the data.
    • The p_vaddr of one of these headers should point to a memory block containing "test data\n"

Instead, the program crashes immediately. It does not execute a single instruction and never reaches the entry point. Not even gdb can debug it:

(gdb) run
Starting program: exe.patched
During startup program terminated with signal SIGSEGV, Segmentation fault.
(gdb) info registers
The program has no registers now.
(gdb) step
The program is not being run.

Without that PT_LOAD header, the program runs without any problems. It also works if I change the segment type to PT_LOOS or any other type.

I can't figure it out. Just what am I doing wrong?


The complete readelf printout as requested:

$ readelf --file-header --program-headers program.patched
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           AArch64
  Version:                           0x1
  Entry point address:               0x2037d8
  Start of program headers:          64 (bytes into file)
  Start of section headers:          43512 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         5
  Size of section headers:           64 (bytes)
  Number of section headers:         8
  Section header string table index: 6

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x000000000000abf8 0x000000000020abf8 0x000000000020abf8
                 0x000000000000000a 0x000000000000000a  R      0x1
  LOAD           0x0000000000000000 0x0000000000200000 0x0000000000200000
                 0x00000000000027d8 0x00000000000027d8  R      0x1000
  LOAD           0x00000000000027d8 0x00000000002037d8 0x00000000002037d8
                 0x0000000000005ed8 0x0000000000005ed8  R E    0x1000
  LOAD           0x00000000000086b0 0x000000000020a6b0 0x000000000020a6b0
                 0x0000000000000000 0x0000000000100015  RW     0x1000
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     0x0

 Section to Segment mapping:
  Segment Sections...
   00     
   01     .rodata 
   02     .text 
   03     .bss 
   04    

Solution

  • PT_LOAD headers must be sorted in ascending order of p_vaddr. Your new program header comes first in the table, yet its p_vaddr (0x20abf8) is higher than that of every PT_LOAD header that follows it.

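    Also, the segments' virtual address ranges should not overlap, but your new segment lies inside the last one: that segment starts at 0x20a6b0 and is 0x100015 bytes long in memory, so it extends to 0x30a6c5, and 0x20abf8 falls within that range. The relevant size of a mapped segment is the larger of p_filesz and p_memsz.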

    This is documented in man 5 elf.
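
The two rules above can be checked mechanically before writing the patched file. What follows is only a minimal sketch, not the kernel's actual validation logic: `check_pt_load`, `seg_size`, `next_free_vaddr` and the `phdr_status` enum are hypothetical names made up for illustration, and the 4 KiB page size used in the usage note is an assumption (AArch64 kernels may be configured with 16 KiB or 64 KiB pages). `next_free_vaddr` proposes an address on a fresh page above every existing PT_LOAD segment and keeps p_vaddr congruent to p_offset modulo the page size.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

enum phdr_status { PHDR_OK = 0, PHDR_UNSORTED = 1, PHDR_OVERLAP = 2 };

/* Size a segment occupies in memory: the larger of p_filesz and p_memsz. */
uint64_t seg_size(const Elf64_Phdr *ph)
{
    return ph->p_memsz > ph->p_filesz ? ph->p_memsz : ph->p_filesz;
}

/* Hypothetical helper: verifies that PT_LOAD entries appear in ascending
 * p_vaddr order and that their byte ranges do not overlap. */
enum phdr_status check_pt_load(const Elf64_Phdr *ph, size_t n)
{
    int seen = 0;
    uint64_t prev_vaddr = 0, max_end = 0;

    for (size_t i = 0; i < n; ++i) {
        if (ph[i].p_type != PT_LOAD)
            continue;
        if (seen && ph[i].p_vaddr < prev_vaddr)
            return PHDR_UNSORTED;
        if (seen && ph[i].p_vaddr < max_end)
            return PHDR_OVERLAP;
        uint64_t end = ph[i].p_vaddr + seg_size(&ph[i]);
        if (end > max_end)
            max_end = end;
        prev_vaddr = ph[i].p_vaddr;
        seen = 1;
    }
    return PHDR_OK;
}

/* Hypothetical helper: proposes a p_vaddr for data stored at file_offset.
 * The address lies on the first page boundary at or above the end of every
 * existing PT_LOAD segment, plus the offset's position within its page, so
 * that p_vaddr and p_offset stay congruent modulo page_size.
 * page_size is an assumption supplied by the caller (power of two). */
uint64_t next_free_vaddr(const Elf64_Phdr *ph, size_t n,
                         uint64_t file_offset, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

    uint64_t max_end = 0;
    for (size_t i = 0; i < n; ++i) {
        if (ph[i].p_type != PT_LOAD)
            continue;
        uint64_t end = ph[i].p_vaddr + seg_size(&ph[i]);
        if (end > max_end)
            max_end = end;
    }
    uint64_t page_base = (max_end + page_size - 1) & ~(page_size - 1);
    return page_base + (file_offset & (page_size - 1));
}
```

Fed the program header table from the question, `check_pt_load` returns `PHDR_UNSORTED`. If the new header is merely moved to the end of the table, it returns `PHDR_OVERLAP`. With the three original PT_LOAD headers, file offset 0xabf8 and an assumed page size of 0x1000, `next_free_vaddr` yields 0x30bbf8, and a table that ends with a PT_LOAD entry at that address passes the check.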