I'm trying to get a better understanding of the ELF format. To do this, I wrote a small C file called share.c and built a shared object called share.so from it. Below are the contents of share.c:
static int count = 0;

void increment()
{
    count++;
}
Below is the command I used to create share.so:
gcc -fPIC -shared -o share.so share.c
I used the readelf tool to look at both the program headers and the section headers within share.so. Below are the program headers in share.so:
Elf file type is DYN (Shared object file)
Entry point 0x550
There are 7 program headers, starting at offset 64
Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x00000000000006f4 0x00000000000006f4  R E    200000
  LOAD           0x0000000000000e30 0x0000000000200e30 0x0000000000200e30
                 0x00000000000001f0 0x00000000000001f8  RW     200000
  DYNAMIC        0x0000000000000e48 0x0000000000200e48 0x0000000000200e48
                 0x0000000000000190 0x0000000000000190  RW     8
  NOTE           0x00000000000001c8 0x00000000000001c8 0x00000000000001c8
                 0x0000000000000024 0x0000000000000024  R      4
  GNU_EH_FRAME   0x0000000000000674 0x0000000000000674 0x0000000000000674
                 0x000000000000001c 0x000000000000001c  R      4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     10
  GNU_RELRO      0x0000000000000e30 0x0000000000200e30 0x0000000000200e30
                 0x00000000000001d0 0x00000000000001d0  R      1
 Section to Segment mapping:
  Segment Sections...
   00     .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .init .plt .plt.got .text .fini .eh_frame_hdr .eh_frame
   01     .init_array .fini_array .jcr .dynamic .got .got.plt .data .bss
   02     .dynamic
   03     .note.gnu.build-id
   04     .eh_frame_hdr
   05
   06     .init_array .fini_array .jcr .dynamic .got
Below are the section headers in share.so:
There are 27 section headers, starting at offset 0x1820:
Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize           Flags Link  Info Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .note.gnu.build-i NOTE             00000000000001c8  000001c8
       0000000000000024  0000000000000000  A        0     0     4
  [ 2] .gnu.hash         GNU_HASH         00000000000001f0  000001f0
       000000000000003c  0000000000000000  A        3     0     8
  [ 3] .dynsym           DYNSYM           0000000000000230  00000230
       0000000000000138  0000000000000018  A        4     2     8
  [ 4] .dynstr           STRTAB           0000000000000368  00000368
       00000000000000ad  0000000000000000  A        0     0     1
  [ 5] .gnu.version      VERSYM           0000000000000416  00000416
       000000000000001a  0000000000000002  A        3     0     2
  [ 6] .gnu.version_r    VERNEED          0000000000000430  00000430
       0000000000000020  0000000000000000  A        4     1     8
  [ 7] .rela.dyn         RELA             0000000000000450  00000450
       00000000000000c0  0000000000000018  A        3     0     8
  [ 8] .init             PROGBITS         0000000000000510  00000510
       000000000000001a  0000000000000000  AX       0     0     4
  [ 9] .plt              PROGBITS         0000000000000530  00000530
       0000000000000010  0000000000000010  AX       0     0    16
  [10] .plt.got          PROGBITS         0000000000000540  00000540
       0000000000000010  0000000000000000  AX       0     0     8
  [11] .text             PROGBITS         0000000000000550  00000550
       0000000000000116  0000000000000000  AX       0     0    16
  [12] .fini             PROGBITS         0000000000000668  00000668
       0000000000000009  0000000000000000  AX       0     0     4
  [13] .eh_frame_hdr     PROGBITS         0000000000000674  00000674
       000000000000001c  0000000000000000  A        0     0     4
  [14] .eh_frame         PROGBITS         0000000000000690  00000690
       0000000000000064  0000000000000000  A        0     0     8
  [15] .init_array       INIT_ARRAY       0000000000200e30  00000e30
       0000000000000008  0000000000000000  WA       0     0     8
  [16] .fini_array       FINI_ARRAY       0000000000200e38  00000e38
       0000000000000008  0000000000000000  WA       0     0     8
  [17] .jcr              PROGBITS         0000000000200e40  00000e40
       0000000000000008  0000000000000000  WA       0     0     8
  [18] .dynamic          DYNAMIC          0000000000200e48  00000e48
       0000000000000190  0000000000000010  WA       4     0     8
  [19] .got              PROGBITS         0000000000200fd8  00000fd8
       0000000000000028  0000000000000008  WA       0     0     8
  [20] .got.plt          PROGBITS         0000000000201000  00001000
       0000000000000018  0000000000000008  WA       0     0     8
  [21] .data             PROGBITS         0000000000201018  00001018
       0000000000000008  0000000000000000  WA       0     0     8
  [22] .bss              NOBITS           0000000000201020  00001020
       0000000000000008  0000000000000000  WA       0     0     4
  [23] .comment          PROGBITS         0000000000000000  00001020
       0000000000000034  0000000000000001  MS       0     0     1
  [24] .shstrtab         STRTAB           0000000000000000  0000173b
       00000000000000e4  0000000000000000           0     0     1
  [25] .symtab           SYMTAB           0000000000000000  00001058
       0000000000000528  0000000000000018          26    44     8
  [26] .strtab           STRTAB           0000000000000000  00001580
       00000000000001bb  0000000000000000           0     0     1
With this information, I can see that the ELF header and the program headers constitute the first 0x1C8 bytes of the file, which is why the first section (.note.gnu.build-i) starts at offset 0x1C8. The reported offsets for all the sections up to but not including .init_array make sense when you take the alignment requirements into account.
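The 0x1C8 figure can be checked with a little arithmetic. The sketch below assumes the standard ELF64 structure sizes (64 bytes for the ELF header, 56 bytes per program header entry); the helper name end_of_program_headers is made up for illustration and is not part of any library.

```c
#include <assert.h>
#include <stdint.h>

/* Sizes fixed by the ELF64 format: sizeof(Elf64_Ehdr) and sizeof(Elf64_Phdr). */
enum { ELF64_EHDR_SIZE = 64, ELF64_PHDR_SIZE = 56 };

/* Hypothetical helper: file offset of the first byte after the
 * program header table, given its start offset and entry count. */
uint64_t end_of_program_headers(uint64_t phoff, uint64_t phnum)
{
    /* The table cannot start inside the ELF header itself. */
    assert(phoff >= ELF64_EHDR_SIZE);
    return phoff + phnum * ELF64_PHDR_SIZE;
}
```

Calling end_of_program_headers(64, 7) with the values readelf reports ("7 program headers, starting at offset 64") gives 0x1C8, the offset of the first section.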
What does not make sense to me is the offset for the section .init_array. The alignment requirement for this section is 8 bytes and the end of the previous section (.eh_frame) is at offset 0x6F4. This would seem to imply that the next section should be located at 0x6F8 (4 bytes of padding). However, readelf reports that the .init_array section starts at offset 0xE30.
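For reference, the expectation above is just "end of the previous section, rounded up to the next section's alignment". A minimal sketch of that calculation, with helper names (section_end, align_up) chosen here purely for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* First offset past a section that starts at `offset` and is `size` bytes long. */
uint64_t section_end(uint64_t offset, uint64_t size)
{
    return offset + size;
}

/* Round `value` up to the next multiple of `align` (must be a power of two). */
uint64_t align_up(uint64_t value, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (value + align - 1) & ~(align - 1);
}
```

With the numbers from the section table, section_end(0x690, 0x64) is 0x6F4 and align_up(0x6F4, 8) is 0x6F8. The same two helpers reproduce the earlier offsets, for example .init ends at 0x52A and .plt (alignment 16) starts at 0x530.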
I thought that perhaps some other section of useful information had been inserted in this unexpected gap, but hexdump shows nothing but null bytes. This leads me to believe it is some sort of padding. The alignment requirement of the LOAD segment containing the .init_array section doesn't seem to explain this padding. Part of the map file created by the linker for this shared object is below:
.eh_frame 0x0000000000000690 0x64
*(.eh_frame)
.eh_frame 0x0000000000000690 0x40 /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crti.o
.eh_frame 0x00000000000006d0 0x20 /tmp/ccDzwTL8.o
0x38 (size before relaxing)
.eh_frame 0x00000000000006f0 0x4 /usr/lib/gcc/x86_64-linux-gnu/5/crtendS.o
*(.eh_frame.*)
.gcc_except_table
*(.gcc_except_table .gcc_except_table.*)
.gnu_extab
*(.gnu_extab*)
.exception_ranges
*(.exception_ranges .exception_ranges*)
0x0000000000200e30 . = DATA_SEGMENT_ALIGN (0x200000, 0x1000)
.eh_frame
*(.eh_frame)
*(.eh_frame.*)
.gnu_extab
*(.gnu_extab)
.gcc_except_table
*(.gcc_except_table .gcc_except_table.*)
.exception_ranges
*(.exception_ranges .exception_ranges*)
.tdata
*(.tdata .tdata.* .gnu.linkonce.td.*)
.tbss
*(.tbss .tbss.* .gnu.linkonce.tb.*)
*(.tcommon)
.preinit_array
*(.preinit_array)
.init_array 0x0000000000200e30 0x8
*(SORT(.init_array.*) SORT(.ctors.*))
*(.init_array EXCLUDE_FILE(*crtend?.o *crtend.o *crtbegin?.o *crtbegin.o) .ctors)
.init_array 0x0000000000200e30 0x8 /usr/lib/gcc/x86_64-linux-gnu/5/crtbeginS.o
The location-counter assignment ". = DATA_SEGMENT_ALIGN (0x200000, 0x1000)" in the map file also doesn't seem to explain this padding. I would expect the location counter to be 0x00000000002006F8 at the start of .init_array, not 0x0000000000200E30.
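To show where the 0x2006F8 expectation comes from, here is a rough sketch. It assumes DATA_SEGMENT_ALIGN(maxpagesize, commonpagesize) behaves like the simpler of the two forms given in the GNU ld manual, ALIGN(maxpagesize) + (. & (maxpagesize - 1)), with no RELRO adjustment. That reading of the manual is an assumption, and the function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Round `value` up to the next multiple of `align` (must be a power of two). */
uint64_t align_up(uint64_t value, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (value + align - 1) & ~(align - 1);
}

/* Assumed model of DATA_SEGMENT_ALIGN without RELRO padding:
 * move to the next maxpagesize boundary, then keep the low bits of
 * the old location counter so the file offset and the virtual
 * address stay congruent modulo maxpagesize. */
uint64_t data_segment_align(uint64_t dot, uint64_t maxpagesize)
{
    return align_up(dot, maxpagesize) + (dot & (maxpagesize - 1));
}
```

With the location counter at 0x6F4 after .eh_frame, data_segment_align(0x6F4, 0x200000) gives 0x2006F4, and applying the 8-byte alignment of .init_array to that gives 0x2006F8.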
Does anyone with more experience with the details of the ELF format have an explanation for this unexpected padding?
I have discovered where the padding comes from. According to this page, the AMD64 toolchain provided with Ubuntu is likely to pass the -z relro option to the linker by default. This explains why there is a GNU_RELRO entry in the program header table. The built-in default linker script contains a DATA_SEGMENT_RELRO_END(offset, exp) directive before the .got.plt section. According to this page:
When ‘-z relro’ option is not present, DATA_SEGMENT_RELRO_END does nothing, otherwise DATA_SEGMENT_ALIGN is padded so that exp + offset is aligned to the most commonly used page boundary for particular target
This explains why the .got.plt section starts on a page boundary (file offset 0x1000, address 0x201000, with a 0x1000-byte page). The sections from .init_array to .got are therefore placed at the end of the preceding page, which introduces the mysterious padding after the .eh_frame section.
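The numbers in the readelf output are consistent with this. The sketch below is only an arithmetic check of the reported values, not the linker's actual algorithm: it assumes the read-only-after-relocation sections are packed back to back and that their end is rounded up to a 0x1000 page boundary. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round `value` up to the next multiple of `align` (must be a power of two). */
uint64_t align_up(uint64_t value, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (value + align - 1) & ~(align - 1);
}

/* Total size of sections laid out back to back with no gaps. */
uint64_t total_size(const uint64_t *sizes, size_t n)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += sizes[i];
    return sum;
}

/* Start address that makes a region of `relro_size` bytes end exactly on
 * a page boundary, given where it would have started without padding. */
uint64_t relro_region_start(uint64_t unpadded_start, uint64_t relro_size,
                            uint64_t page)
{
    uint64_t aligned_end = align_up(unpadded_start + relro_size, page);
    return aligned_end - relro_size;
}
```

The sizes of .init_array, .fini_array, .jcr, .dynamic and .got (0x8, 0x8, 0x8, 0x190, 0x28) add up to 0x1D0, which matches the size of the GNU_RELRO segment. relro_region_start(0x2006F8, 0x1D0, 0x1000) then yields 0x200E30, the address readelf reports for .init_array, and the corresponding file offset 0xE30 leaves 0x73C bytes of padding after the end of .eh_frame at 0x6F4.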
The GNU_RELRO entry in the program header table and the padding disappear when share.so is built using the following command:
gcc -fPIC -shared -Wl,-z,norelro -o share.so share.c