
ld linker script producing huge binary


I'm using binutils-2.21.53.0.1-6.fc16.x86_64.

I have a small object file, hello.o, with just enough "stuff" in it to give every section some contents:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .text             PROGBITS         0000000000000000  00000040
       000000000000005d  0000000000000000  AX       0     0     4
  [ 2] .rela.text        RELA             0000000000000000  00000808
       0000000000000060  0000000000000018          15     1     8
  [ 3] .data             PROGBITS         0000000000000000  000000a0
       0000000000000000  0000000000000000  WA       0     0     4
  [ 4] .bss              NOBITS           0000000000000000  000000a0
       0000000000000053  0000000000000000  WA       0     0     32
  [ 5] .rodata           PROGBITS         0000000000000000  000000a0
       000000000000000f  0000000000000000   A       0     0     1
  [ 6] .data.rel.local   PROGBITS         0000000000000000  000000b0
       0000000000000008  0000000000000000  WA       0     0     8
  [ 7] .rela.data.rel.lo RELA             0000000000000000  00000868
       0000000000000018  0000000000000018          15     6     8
  [ 8] .data.rel         PROGBITS         0000000000000000  000000b8
       0000000000000008  0000000000000000  WA       0     0     8
  [ 9] .rela.data.rel    RELA             0000000000000000  00000880
       0000000000000018  0000000000000018          15     8     8
  [10] .comment          PROGBITS         0000000000000000  000000c0
       000000000000002d  0000000000000001  MS       0     0     1
  [11] .note.GNU-stack   PROGBITS         0000000000000000  000000ed
       0000000000000000  0000000000000000           0     0     1
  [12] .eh_frame         PROGBITS         0000000000000000  000000f0
       0000000000000058  0000000000000000   A       0     0     8
  [13] .rela.eh_frame    RELA             0000000000000000  00000898
       0000000000000030  0000000000000018          15    12     8
  [14] .shstrtab         STRTAB           0000000000000000  00000148
       0000000000000085  0000000000000000           0     0     1
  [15] .symtab           SYMTAB           0000000000000000  00000610
       00000000000001b0  0000000000000018          16    11     8
  [16] .strtab           STRTAB           0000000000000000  000007c0
       0000000000000045  0000000000000000           0     0     1

If I use -pie and no linker script, the results are as expected:

$ ld -pie -Map hello_pie.map -o hello_pie.elf hello.o 

$ ll hello_pie.elf 
-rwxrwx---. 1 jreinhart jreinhart 3453 Mar 13 23:44 hello_pie.elf

However, if I include any sort of linker script, the output size explodes:

$ cat 1.ld 
SECTIONS
{

}
$ ld -T 1.ld -pie -Map hello_pie.map -o hello_pie.elf hello.o 
$ ll hello_pie.elf 
-rwxrwx---. 1 jreinhart jreinhart 2100070 Mar 13 23:45 hello_pie.elf

As you can see, this file became huge.

Note that this appears to happen because the .text section insists on starting at offset 0x200000 in the file:

$ readelf -l -S hello_pie.elf 
There are 19 section headers, starting at offset 0x200400:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .text             PROGBITS         0000000000000000  00200000  <--- Why?
       000000000000005d  0000000000000000  AX       0     0     4
  [ 2] .rodata           PROGBITS         000000000000005d  0020005d
       000000000000000f  0000000000000000   A       0     0     1
  [ 3] .eh_frame         PROGBITS         0000000000000070  00200070
       0000000000000058  0000000000000000   A       0     0     8
  [ 4] .interp           PROGBITS         00000000000000c8  002000c8
       000000000000000f  0000000000000000   A       0     0     1
  [ 5] .dynsym           DYNSYM           00000000000000d8  002000d8
       0000000000000078  0000000000000018   A       6     2     8
  [ 6] .dynstr           STRTAB           0000000000000150  00200150
       0000000000000014  0000000000000000   A       0     0     1
  [ 7] .hash             HASH             0000000000000168  00200168
       0000000000000028  0000000000000004   A       5     0     8
  [ 8] .rela.dyn         RELA             0000000000000190  00200190
       0000000000000078  0000000000000018   A       5     0     8
  [ 9] .data.rel.local   PROGBITS         0000000000000208  00200208
       0000000000000008  0000000000000000  WA       0     0     8
  [10] .data.rel         PROGBITS         0000000000000210  00200210
       0000000000000008  0000000000000000  WA       0     0     8
  [11] .dynamic          DYNAMIC          0000000000000218  00200218
       00000000000000f0  0000000000000010  WA       6     0     8
  [12] .got              PROGBITS         0000000000000308  00200308
       0000000000000018  0000000000000008  WA       0     0     8
  [13] .got.plt          PROGBITS         0000000000000320  00200320
       0000000000000018  0000000000000008  WA       0     0     8
  [14] .bss              NOBITS           0000000000000340  00200338
       0000000000000053  0000000000000000  WA       0     0     32
  [15] .comment          PROGBITS         0000000000000000  00200338
       000000000000002c  0000000000000001  MS       0     0     1
  [16] .shstrtab         STRTAB           0000000000000000  00200364
       000000000000009a  0000000000000000           0     0     1
  [17] .symtab           SYMTAB           0000000000000000  002008c0
       0000000000000258  0000000000000018          18    19     8
  [18] .strtab           STRTAB           0000000000000000  00200b18
       000000000000004e  0000000000000000           0     0     1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), l (large)
  I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)

Elf file type is DYN (Shared object file)
Entry point 0x0
There are 5 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000200040 0x0000000000000000
                 0x0000000000000118 0x0000000000000118  R E    8
  INTERP         0x00000000002000c8 0x00000000000000c8 0x00000000000000c8
                 0x000000000000000f 0x000000000000000f  R      1
      [Requesting program interpreter: /lib/ld64.so.1]
  LOAD       --> 0x0000000000200000 0x0000000000000000 0x0000000000000000
                 0x0000000000000338 0x0000000000000393  RWE    200000
  DYNAMIC        0x0000000000200218 0x0000000000000218 0x0000000000000218
                 0x00000000000000f0 0x00000000000000f0  RW     8
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     8

This has been happening regardless of the contents of my linker script. Any ideas what's going on?


Solution

  • By default, ld page-aligns the loadable segments of its output. On x86-64 the alignment it uses is 2 MB (0x200000 bytes, the size of a superpage; note the Align value of 200000 on the LOAD segment in your readelf output), so your .text section ends up at file offset 0x200000. This looks like a bug in ld, since one would expect it to use offset 0x0000000 instead (see the EDIT below for a possible explanation).

    To avoid this alignment, and the larger file it produces, you can pass the --nmagic flag to ld, which stops it from page-aligning your .text section. It has a side effect, though: it also disables linking against shared libraries. Take care to align the other sections (.data, .rodata, ...) to 2 MB pages yourself, because they cannot share a page with .text: these sections need different access permissions.

    EDIT: Thinking about it, we all expect an access to virtual address 0x00000000 to raise an exception (segfault). I see two ways to arrange that: either the kernel maps a page with no access rights (r/w/x), or (more likely) it maps nothing at all (no page mapped => segfault), and the linker must know that somehow. That could explain why ld skips the first page, the one at address zero. This is still to be confirmed.
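
The arithmetic behind the 0x200000 offset can be sketched in shell. This is a simplified model, not ld's actual algorithm: it assumes that a loadable segment is placed at the first file offset past the headers that is congruent to the segment's virtual address modulo its alignment, which is the relation ELF requires between p_offset and p_vaddr. The function name first_offset is hypothetical; 0x158 is where the headers end in the output from the question (program headers at 0x40, size 0x118).

```shell
#!/bin/sh
# first_offset HEADERS_END VADDR ALIGN   (hypothetical helper, simplified model)
# Prints the smallest file offset >= HEADERS_END that satisfies
#   offset % ALIGN == VADDR % ALIGN
first_offset() {
    headers_end=$(( $1 ))
    vaddr=$(( $2 ))
    align=$(( $3 ))
    want=$(( vaddr % align ))          # residue the offset must have
    have=$(( headers_end % align ))    # residue right after the headers
    pad=$(( (want - have + align) % align ))
    printf '0x%x\n' $(( headers_end + pad ))
}

# .text at address 0 with 2 MB alignment: pushed out to the next 2 MB boundary
first_offset 0x158 0 0x200000      # prints 0x200000

# same layout with a 4 KB alignment: only one small page of padding
first_offset 0x158 0 0x1000        # prints 0x1000

# .text at an address just past the headers: no padding at all
first_offset 0x158 0x158 0x200000  # prints 0x158
```

Under this model the padding disappears either when the alignment is smaller or when .text does not start at address 0. GNU ld has a -z max-page-size=VALUE option for the former; for the latter, starting the SECTIONS block with ". = SIZEOF_HEADERS;" is one thing to try, though neither was tested against the binutils version in the question.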