A few days ago I started playing with the PE format. I've written a small PE loader that loads sections into memory according to their virtual addresses. For example, I have section .text at virtual address 0x1000 and section .data at 0x2000. With my small assembly code, I loaded the PE file at some free location (0x10000) and placed the sections at their offsets from it. So section .text is at 0x11000 (0x10000 + 0x1000), .data is at 0x12000, and so on. But when I referenced my data from the code section, I found (in the disassembly) that the reference points to 0x402000. On the internet I found something called the image base, which is specific to each type of image. But I don't understand how an .exe can be loaded at 0x402000 when there are lots of executables running in Windows, for example. Does anybody know why this is so, how it works, and how I could theoretically implement it in my very basic system?
Please help.
Virtual memory means that every single process on your computer can use the "same" addresses, since each process has its own independent address space. The OS maps 0x400000 in process A to a different physical address than 0x400000 in process B, even though the virtual address is the same (different virtual address spaces).
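As a rough illustration of that mapping, here is a toy model in Python. It is only a sketch: the `AddressSpace` class, the 4 KiB page size, and the physical addresses are made up for the example, and a real OS does this with hardware page tables rather than a dictionary.

```python
PAGE_SIZE = 0x1000  # assumed 4 KiB pages


class AddressSpace:
    """Toy per-process page table: virtual page number -> physical page number."""

    def __init__(self):
        self.page_table = {}

    def map_page(self, virtual_addr, physical_addr):
        # Record which physical page backs the given virtual page.
        self.page_table[virtual_addr // PAGE_SIZE] = physical_addr // PAGE_SIZE

    def translate(self, virtual_addr):
        # Split the address into page number and offset, then swap the page.
        page, offset = divmod(virtual_addr, PAGE_SIZE)
        return self.page_table[page] * PAGE_SIZE + offset


# Two processes, each with its own table (physical addresses are arbitrary).
process_a = AddressSpace()
process_b = AddressSpace()
process_a.map_page(0x400000, 0x01230000)
process_b.map_page(0x400000, 0x0789A000)

# Same virtual address, different physical memory.
print(hex(process_a.translate(0x400010)))
print(hex(process_b.translate(0x400010)))
```

Both processes use virtual address 0x400010, but each lookup goes through that process's own table, so the two never collide.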
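The default base address for a 32-bit executable is 0x400000. Your linker hardcodes that base address into the executable (the ImageBase field of the PE header) and adjusts absolute address references to match, which is why your data at virtual address 0x2000 shows up as 0x402000 (0x400000 + 0x2000). The executable is normally loaded at that address when the program is launched. Your assembler or linker should give you a way to change this default base address (Microsoft's linker, for example, has the /BASE option).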
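To see which base address a given file was linked for, you can read it out of the PE header. The following is a minimal sketch in Python; the function names `read_image_base` and `rva_to_va` are my own, and it assumes the whole file has been read into a `bytes` object. The offsets come from the PE layout: `e_lfanew` at 0x3C, a 4-byte signature, a 20-byte COFF header, then the optional header with ImageBase at offset 28 (PE32) or 24 (PE32+).

```python
import struct


def read_image_base(image: bytes) -> int:
    """Return the preferred load address stored in a PE file's optional header."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew at 0x3C holds the file offset of the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # Skip the 4-byte signature and the 20-byte COFF file header.
    optional_header = e_lfanew + 4 + 20
    (magic,) = struct.unpack_from("<H", image, optional_header)
    if magic == 0x10B:    # PE32: 4-byte ImageBase at offset 28
        (image_base,) = struct.unpack_from("<I", image, optional_header + 28)
    elif magic == 0x20B:  # PE32+: 8-byte ImageBase at offset 24
        (image_base,) = struct.unpack_from("<Q", image, optional_header + 24)
    else:
        raise ValueError("unknown optional header magic")
    return image_base


def rva_to_va(image_base: int, rva: int) -> int:
    """Address the linker bakes into the code for something at the given RVA."""
    return image_base + rva
```

With an image base of 0x400000, `rva_to_va(0x400000, 0x2000)` gives 0x402000, the address from the question's disassembly.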
Note that DLLs, on the other hand, have to be loaded at unique addresses because they must coexist in the same process. For this reason, DLLs are normally relocatable, i.e. they can have any base address when loaded to cope with the requirement of putting them at a unique address. (Having multiple non-relocatable DLLs on a system can cause problems, but having multiple non-relocatable .exes on a system is no problem at all.)
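For a loader like the one in the question, which places the image at 0x10000 instead of its preferred base, "relocatable" means the image carries a base relocation table (usually the .reloc section) listing every absolute address that has to be patched. Below is a hedged sketch of how applying it could look, in Python rather than assembly for readability. The function name and parameters are hypothetical: `image` is assumed to be the mapped image indexed by RVA, and `reloc_data` the raw contents of the relocation table. If the executable was linked without relocations, this is not possible and the image has to be loaded at its preferred base.

```python
import struct

IMAGE_REL_BASED_ABSOLUTE = 0   # padding entry, nothing to patch
IMAGE_REL_BASED_HIGHLOW = 3    # patch a 32-bit absolute address
IMAGE_REL_BASED_DIR64 = 10     # patch a 64-bit absolute address


def apply_relocations(image: bytearray, reloc_data: bytes,
                      preferred_base: int, actual_base: int) -> None:
    """Patch absolute addresses in `image` for loading at `actual_base`."""
    delta = actual_base - preferred_base
    pos = 0
    while pos + 8 <= len(reloc_data):
        # Each block starts with the page RVA and the total block size.
        page_rva, block_size = struct.unpack_from("<II", reloc_data, pos)
        if block_size < 8:
            break  # malformed or terminating block
        end = min(pos + block_size, len(reloc_data))
        for entry_pos in range(pos + 8, end - 1, 2):
            (entry,) = struct.unpack_from("<H", reloc_data, entry_pos)
            kind, offset = entry >> 12, entry & 0xFFF  # 4-bit type, 12-bit offset
            target = page_rva + offset
            if kind == IMAGE_REL_BASED_ABSOLUTE:
                continue
            if kind == IMAGE_REL_BASED_HIGHLOW:
                (value,) = struct.unpack_from("<I", image, target)
                struct.pack_into("<I", image, target, (value + delta) & 0xFFFFFFFF)
            elif kind == IMAGE_REL_BASED_DIR64:
                (value,) = struct.unpack_from("<Q", image, target)
                struct.pack_into("<Q", image, target,
                                 (value + delta) & 0xFFFFFFFFFFFFFFFF)
            else:
                raise ValueError(f"unsupported relocation type {kind}")
        pos += block_size
```

With a preferred base of 0x400000 and an actual base of 0x10000, a stored reference to 0x402000 is rewritten to 0x12000, which is where the question's loader put .data.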