linux x86-64 elf aslr position-independent-code

Why can't non-PIC code be fully ASLR'd using run-time fixups?

I understand that PIC makes ASLR more efficient and easier, since the code can be placed anywhere in memory without being modified. But if I understand correctly, according to Wikipedia's Relocation article, the dynamic linker can apply "fixups" at run time so a symbol can be located even though the code is not position-independent. Yet according to many answers I've seen here, non-PIC code can't have any of its sections randomized except the stack (so the program entry point can't be randomized). If that is correct, then what are run-time fixups used for, and why can't we just fix up every location in the code at run time, before the program starts, to randomize the program entry point?

Solution

TL;DR: Not all uses of absolute addresses will have relocation info in a non-PIE executable (ELF type EXEC, not DYN). Therefore the kernel's program-loader can't find them all to apply fixups.

Thus there's no way to retroactively enable ASLR for executables built as non-PIE. There's no way for a traditional executable to flag itself as having relocation metadata for every use of an absolute address, and no point in adding such a feature either since if you want text ASLR you'd just build a PIE.

Because ELF-type EXEC Linux executables are guaranteed to be loaded / mapped at the fixed base address chosen by the linker at link time, it would be a waste of space in the executable to make symbol-table entries for internal symbols. So toolchains didn't do that, and there's no reason to start. That's simply how traditional ELF executables were designed; Linux switched from a.out to ELF back in the mid 90s before stack ASLR was a thing, so it wasn't on people's radar.
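To make the EXEC vs. DYN distinction concrete, here is a minimal Python sketch that reads the e_type field of an ELF header. The values 2 (ET_EXEC) and 3 (ET_DYN) come from the ELF specification; fake_header is a hypothetical helper that builds a header prefix in memory so the example runs without a real binary.

```python
import struct

# e_type values from the ELF specification.
ET_EXEC = 2  # traditional executable, linked for a fixed base address
ET_DYN = 3   # shared object or PIE, loadable at any base


def elf_type(header: bytes) -> str:
    """Return a label for the e_type field of an ELF header."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # EI_DATA (byte 5 of e_ident): 1 = little-endian, 2 = big-endian.
    endian = "<" if header[5] == 1 else ">"
    # e_type is the 16-bit field right after the 16-byte e_ident array.
    (e_type,) = struct.unpack_from(endian + "H", header, 16)
    labels = {ET_EXEC: "EXEC (non-PIE)", ET_DYN: "DYN (PIE or shared object)"}
    return labels.get(e_type, f"other ({e_type})")


def fake_header(e_type: int) -> bytes:
    """Hypothetical helper: minimal little-endian ELF64 header prefix."""
    e_ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)  # class, data, version, padding
    return e_ident + struct.pack("<H", e_type)


print(elf_type(fake_header(ET_EXEC)))
print(elf_type(fake_header(ET_DYN)))
```

On a real system you could pass the first 18 bytes of a binary instead, e.g. open(path, "rb").read(18); readelf -h reports the same field as "Type".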

For example, the absolute address of static char buf[100] is probably embedded somewhere in the machine code that uses it (in 32-bit code, or in 64-bit code that puts the address in a register), but without relocation metadata there's no way to know where, or how many times.

Also, for x86-64 specifically, the default code model for non-PIE executables guarantees that static addresses (text / data / bss) will all be in the low 2GiB of virtual address space, so 32-bit absolute signed or unsigned addresses can work, and rel32 displacements can reach anything from anything. That's why non-PIE compiler output uses mov $symbol, %edi (5 bytes) to put an address in a register, instead of lea symbol(%rip), %rdi (7 bytes). https://godbolt.org/z/89PeK1
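The two encodings can be sketched in Python to show why one of them pins the code to a fixed address. The addresses INSN and BUF and the SLIDE amount are made-up values for illustration; the opcode bytes (BF for mov imm32 into EDI, 48 8D 3D for a RIP-relative lea into RDI) are the standard x86-64 encodings.

```python
import struct


def mov_edi_imm32(symbol_addr: int) -> bytes:
    """mov $symbol, %edi: opcode BF followed by the 32-bit absolute address."""
    return b"\xbf" + struct.pack("<I", symbol_addr)


def lea_rdi_rip(insn_addr: int, symbol_addr: int) -> bytes:
    """lea symbol(%rip), %rdi: 48 8D 3D followed by a rel32 displacement."""
    insn_len = 7
    # The displacement is relative to the end of the instruction.
    rel32 = symbol_addr - (insn_addr + insn_len)
    return b"\x48\x8d\x3d" + struct.pack("<i", rel32)


INSN, BUF = 0x401000, 0x404040  # hypothetical link-time addresses
SLIDE = 0x10000                 # hypothetical shift applied to the whole image

mov_before = mov_edi_imm32(BUF)
mov_after = mov_edi_imm32(BUF + SLIDE)
lea_before = lea_rdi_rip(INSN, BUF)
lea_after = lea_rdi_rip(INSN + SLIDE, BUF + SLIDE)

print(len(mov_before), len(lea_before))  # instruction sizes in bytes
print("mov bytes unchanged after slide:", mov_before == mov_after)
print("lea bytes unchanged after slide:", lea_before == lea_after)
```

Moving code and data together leaves the lea bytes untouched, while the mov immediate would have to be patched, and the loader can only do that if something tells it where the immediate is.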

So even if you did know where every absolute address was, you could only ASLR it within the low 2GiB, limiting the number of bits of entropy you could introduce. (I think Windows has a mode for this: LargeAddressAware = no. But Linux doesn't; see "32-bit absolute addresses no longer allowed in x86-64 Linux?". Again, PIE is a better way to allow text ASLR, so people (distros) should just compile for that if they want its benefits.)
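As a rough back-of-the-envelope check of that entropy limit, the Python sketch below computes an upper bound on the number of random bits for a given address window. It assumes 4 KiB page alignment and a 47-bit user address space for comparison; both are assumptions for illustration, and real kernels randomize fewer bits than these ceilings.

```python
import math

PAGE = 4096  # assumption: images are placed at 4 KiB granularity


def max_aslr_bits(window_bytes: int, align: int = PAGE) -> float:
    """Upper bound on entropy: log2 of the number of aligned base addresses."""
    return math.log2(window_bytes // align)


low_2gib = max_aslr_bits(2**31)    # window reachable by 32-bit absolute addresses
full_user = max_aslr_bits(2**47)   # assumed 47-bit user address space

print(low_2gib, full_user)
```

Under these assumptions the low-2GiB window allows at most 19 bits, against a ceiling of 35 bits for the full window.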

Unlike Windows, Linux doesn't spend huge effort on things that can be handled better and more efficiently by recompiling binaries from source.

That being said, GNU/Linux does support fixup relocations for 64-bit absolute addresses even in PIC / PIE ELF shared objects. That's why beginner code like NASM mov rdi, BUFFER can work even in a shared library: use objdump -drwC -Mintel to see the relocation info on that use of the symbol in a mov reg, imm64 instruction. An lea rdi, [rel BUFFER] wouldn't need any relocation entry if BUFFER wasn't a global symbol. (Equivalent of C static.)
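What such a fixup amounts to can be sketched in a few lines of Python. This is a simplified model of a "base + addend" relocation (in the spirit of R_X86_64_RELATIVE), not the dynamic linker's actual code; the image contents, BUFFER_OFFSET and the load base are hypothetical.

```python
import struct


def apply_relative_relocs(image: bytearray, relocs, load_base: int) -> None:
    """For each (offset, addend) entry, store load_base + addend as a 64-bit word."""
    for offset, addend in relocs:
        struct.pack_into("<Q", image, offset, load_base + addend)


# mov rdi, imm64 is encoded as 48 BF followed by 8 immediate bytes.
BUFFER_OFFSET = 0x4040                     # hypothetical offset of BUFFER in the image
image = bytearray(b"\x48\xbf" + bytes(8))  # immediate not yet filled in
relocs = [(2, BUFFER_OFFSET)]              # the immediate starts 2 bytes into the insn

apply_relative_relocs(image, relocs, load_base=0x7F0000000000)
(patched,) = struct.unpack_from("<Q", image, 2)
print(hex(patched))  # 0x7f0000004040
```

The important part is the relocs list: it is exactly the metadata that a non-PIE executable does not carry for its internal absolute addresses.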

You might be wondering why metadata is essential:

There's no reliable way to search text/data for possible absolute addresses; false positives would be possible. e.g. /usr/bin/ld probably contains 0x401000 as the default start address for an x86-64 executable. You don't want ASLR of ld's code+data to also change its defaults. Or that integer value could have come up in any number of ways in many programs, e.g. as a bitmap. And of course x86-64 machine code is variable length so there's no reliable way to even distinguish opcodes from immediate operands in the most general case.
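A small Python sketch of the false-positive problem: a naive scan for a 4-byte value cannot tell an address embedded in an instruction from an integer constant that happens to have the same bytes. The blob here is made up for illustration.

```python
import struct


def find_u32(blob: bytes, value: int) -> list:
    """Return every offset where the little-endian 32-bit value occurs."""
    needle = struct.pack("<I", value)
    hits, start = [], 0
    while (pos := blob.find(needle, start)) != -1:
        hits.append(pos)
        start = pos + 1
    return hits


ADDR = 0x401000
code = b"\xbf" + struct.pack("<I", ADDR)  # mov $0x401000, %edi: a real address use
data = struct.pack("<I", ADDR)            # an unrelated integer constant, same bytes
blob = code + b"\x90" * 3 + data          # three NOPs as padding

print(find_u32(blob, ADDR))  # [1, 8]
```

Both hits look identical to the scanner; only the first should be adjusted if the image moves, and patching the second would silently change program data.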

False negatives are also possible. It's not very likely that an x86 program would construct an absolute address in a register using multiple instructions, but it's certainly possible. In non-x86 code, however, that would be common.

RISC machines with fixed-length instructions can't put a 32-bit address into a 32-bit instruction; there'd be no room left for anything else. So to load from static storage, the absolute addresses would have to be split across multiple instructions, like MIPS lui $t0, %hi(0x612300) / lw $t1, %lo(0x612300)($t0) to load from a static variable at absolute address 0x612300. (There would normally be a symbol name in the asm source, but it wouldn't appear in the final linked binary unless it was .globl, so I used numbers as a reminder.) Instructions like that don't have to come in pairs; the same high-half of the address could be reused by other accesses into the same array or struct in later instructions.
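The split can be sketched in Python using the standard MIPS I-type encodings for lui and lw. The hi16/lo16 helpers mimic what %hi and %lo compute (with the usual +0x8000 adjustment because lw sign-extends its 16-bit offset); the register numbers 8 and 9 are $t0 and $t1, and big-endian byte order is assumed.

```python
import struct


def hi16(addr: int) -> int:
    """Upper half, adjusted because the low half is sign-extended when added."""
    return ((addr + 0x8000) >> 16) & 0xFFFF


def lo16(addr: int) -> int:
    return addr & 0xFFFF


def lui(rt: int, imm: int) -> int:
    return 0x3C000000 | (rt << 16) | imm


def lw(rt: int, base: int, imm: int) -> int:
    return 0x8C000000 | (base << 21) | (rt << 16) | imm


T0, T1 = 8, 9
ADDR = 0x612300

# lui $t0, %hi(ADDR) ; lw $t1, %lo(ADDR)($t0), assembled as big-endian words
code = struct.pack(">II", lui(T0, hi16(ADDR)), lw(T1, T0, lo16(ADDR)))
print(code.hex())

# The full 32-bit address never appears as contiguous bytes in either byte order.
print(struct.pack(">I", ADDR) in code, struct.pack("<I", ADDR) in code)
```

A byte-pattern search for 0x612300 finds nothing here, which is the false-negative case: both 16-bit halves would need patching, and they sit in different instructions.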