I'm on Ubuntu 20.04, gcc 9.3.0, ld 2.34.
I have a simple hello world program that does not use glibc or any other library and only makes a write syscall. Despite this, my binary is roughly 8 KB. I'm unsure why it is that large and not, say, 1 KB.
C Program:
int
x64_syscall_write(int fd, char const *data, unsigned long int data_size)
{
    int result = 0;
    /* syscall number 1 (write) in rax; arguments in rdi, rsi, rdx */
    __asm__ __volatile__("syscall"
                         : "=a" (result)
                         : "a" (1), "D" (fd),
                           "S" (data), "d" (data_size)
                         : "r11", "rcx", "memory");
    return result;
}

__asm__(".global entry_point\n"
        "entry_point:\n"
        "xor rbp, rbp\n"                    /* clear the frame pointer for the outermost frame */
        "pop rdi\n"                         /* argc */
        "mov rsi, rsp\n"                    /* argv */
        "and rsp, 0xfffffffffffffff0\n"     /* align the stack to 16 bytes */
        "call main\n"
        "mov rdi, rax\n"                    /* exit status = return value of main */
        "mov rax, 60\n"                     /* syscall number 60 (exit) */
        "syscall\n"
        "ret");

int
main(int argc, char *argv[])
{
    x64_syscall_write(1, "hello\n", 6);
    return 0;
}
Built with:
gcc -ffreestanding -static -nostdlib -no-pie -masm=intel \
-fno-unwind-tables -fno-asynchronous-unwind-tables \
-Wl,--gc-sections -fdata-sections -Os \
hello.c -c -o hello.o
# NOTE: I know more could be done here to shave
# off a few more bytes, but I feel this is the bulk of it.
ld -e entry_point hello.o -o hello
hello.o is 1.7 KB.
hello is 8.4 KB.
readelf -Wl hello
Elf file type is EXEC (Executable file)
Entry point 0x40101c
There are 6 program headers, starting at offset 64
Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  LOAD           0x000000 0x0000000000400000 0x0000000000400000 0x0001b0 0x0001b0 R   0x1000
  LOAD           0x001000 0x0000000000401000 0x0000000000401000 0x000045 0x000045 R E 0x1000
  LOAD           0x002000 0x0000000000402000 0x0000000000402000 0x000007 0x000007 R   0x1000
  NOTE           0x000190 0x0000000000400190 0x0000000000400190 0x000020 0x000020 R   0x8
  GNU_PROPERTY   0x000190 0x0000000000400190 0x0000000000400190 0x000020 0x000020 R   0x8
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x10
Section to Segment mapping:
  Segment Sections...
   00     .note.gnu.property
   01     .text
   02     .rodata
   03     .note.gnu.property
   04     .note.gnu.property
   05
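Here you can see that the linker created 3 LOAD segments: one for the ELF header and other metadata, one for .text, and one for .rodata. Each has Align 0x1000, so each starts on its own 4 KiB page in the file: .text at offset 0x1000 and .rodata at offset 0x2000. The last segment therefore ends at 0x2007 (8199 bytes), nearly all of it padding, and the non-loaded sections and section header table that follow bring the file to about 8.4 KB.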
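The padding arithmetic can be sketched in a few lines of C. This is only an illustration, not something ld exposes: the helper names align_up and end_of_last_segment are made up here, and the model assumes each LOAD segment simply starts at the next multiple of its alignment.

```c
#include <stddef.h>
#include <stdint.h>

/* Round offset up to the next multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t offset, uint64_t align)
{
    return (offset + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: given the FileSiz of each LOAD segment in file order,
 * return the file offset just past the last segment when every segment
 * starts on its own `align` boundary.
 */
static uint64_t end_of_last_segment(const uint64_t *file_sizes, size_t count,
                                    uint64_t align)
{
    uint64_t end = 0;
    for (size_t i = 0; i < count; i++) {
        uint64_t start = align_up(end, align);  /* padding is inserted here */
        end = start + file_sizes[i];
    }
    return end;
}
```

Called with the FileSiz values from the readelf output above ({0x1b0, 0x45, 0x7}) and an alignment of 0x1000, end_of_last_segment returns 0x2007, i.e. 8199 bytes for only 508 bytes of segment content.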
Linking with -z noseparate-code results in a much smaller binary (smaller than hello.o):
ls -l hello*
-rwxr-xr-x 1 user user 1384 Apr 26 22:24 hello
-rw-r--r-- 1 user user 603 Apr 26 22:22 hello.c
-rw-r--r-- 1 user user 1680 Apr 26 22:22 hello.o
readelf -Wl hello
Elf file type is EXEC (Executable file)
Entry point 0x40015c
There are 4 program headers, starting at offset 64
Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  LOAD           0x000000 0x0000000000400000 0x0000000000400000 0x00018c 0x00018c R E 0x1000
  NOTE           0x000120 0x0000000000400120 0x0000000000400120 0x000020 0x000020 R   0x8
  GNU_PROPERTY   0x000120 0x0000000000400120 0x0000000000400120 0x000020 0x000020 R   0x8
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x10
Section to Segment mapping:
  Segment Sections...
   00     .note.gnu.property .text .rodata
   01     .note.gnu.property
   02     .note.gnu.property
   03
You can shrink this further by removing the .note.GNU-stack and .note.gnu.property sections:
objcopy -R '.note.*' hello.o hello1.o
ld -e entry_point hello1.o -o hello1 -z noseparate-code
ls -l hello1*
-rwxr-xr-x 1 user user 1072 Apr 26 22:38 hello1
-rw-r--r-- 1 user user 1440 Apr 26 22:37 hello1.o
readelf -Wl hello1
Elf file type is EXEC (Executable file)
Entry point 0x400094
There is 1 program header, starting at offset 64
Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  LOAD           0x000000 0x0000000000400000 0x0000000000400000 0x0000c4 0x0000c4 R E 0x1000
Section to Segment mapping:
  Segment Sections...
   00     .text .rodata
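If you want to check the number of LOAD segments without readelf, something along these lines should work. It is a rough sketch that assumes a Linux system with <elf.h> and an ELF64 image already read into memory; count_load_segments is a name made up for this example, not a library function.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: count the PT_LOAD program headers in an ELF64 image
 * held in memory. Returns -1 if the buffer does not look like an ELF64 file
 * or the program header table does not fit inside the buffer.
 */
static int count_load_segments(const unsigned char *image, size_t size)
{
    Elf64_Ehdr ehdr;

    if (size < sizeof ehdr)
        return -1;
    memcpy(&ehdr, image, sizeof ehdr);      /* copy out to avoid unaligned reads */

    if (memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0 ||
        ehdr.e_ident[EI_CLASS] != ELFCLASS64 ||
        ehdr.e_phentsize != sizeof(Elf64_Phdr))
        return -1;

    /* The whole program header table must lie inside the buffer. */
    if (ehdr.e_phoff > size ||
        (size - ehdr.e_phoff) / sizeof(Elf64_Phdr) < ehdr.e_phnum)
        return -1;

    int loads = 0;
    for (unsigned i = 0; i < ehdr.e_phnum; i++) {
        Elf64_Phdr phdr;
        memcpy(&phdr, image + ehdr.e_phoff + (size_t)i * sizeof phdr,
               sizeof phdr);
        if (phdr.p_type == PT_LOAD)
            loads++;
    }
    return loads;
}
```

Going by the readelf output above, this would be expected to give 3 for the default link and 1 for both hello and hello1 linked with -z noseparate-code.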