I encountered the following assembly instruction: and rax, qword ptr [0xff5ff098]
What I want to know is which memory address this instruction will access.
Will the memory address 0xff5ff098 be zero-extended, or will it be sign-extended (extended with copies of its most significant bit)?
In the Intel Pin tool, the API IARG_MEMORYREAD_EA reports it as extended with 1 bits, i.e. it gives the address 0xffffffffff5ff098. Is this address a possible address?
I am working on a 64-bit machine.
How is it encoded? RIP-relative or absolute?
If absolute, it's using the sign-extended-disp32 addressing mode (because 32-bit displacements in addressing modes are always sign extended, even when there are no registers involved).
If it's RIP-relative, then your disassembler should be showing you the correct final address, calculated from RIP + rel32.
Either your disassembler or Pin is showing it incorrectly: if the displacement really is sign-extended to 64 bits, your disassembler should be showing you the full sign-extended address.
And yes, both RIP-relative and absolute addressing are possible in x86-64.
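To make the sign-extension in the absolute case concrete, here is a minimal Python sketch. The helper name sign_extend_disp32 is made up for illustration; it only models the arithmetic and is not part of Pin or any disassembler.

```python
MASK64 = (1 << 64) - 1

def sign_extend_disp32(disp32: int) -> int:
    """Model how a 32-bit displacement becomes a 64-bit address.

    Hypothetical helper: returns the result as an unsigned 64-bit value.
    """
    disp32 &= 0xFFFFFFFF
    if disp32 & 0x80000000:      # bit 31 set: upper 32 bits become all ones
        disp32 -= 1 << 32
    return disp32 & MASK64

# The displacement from the question has bit 31 set, so it lands in the
# upper half of the address space.
print(hex(sign_extend_disp32(0xff5ff098)))  # 0xffffffffff5ff098

# A displacement with bit 31 clear is unchanged.
print(hex(sign_extend_disp32(0x004000b5)))  # 0x4000b5
```

Under this model, the value Pin reports (0xffffffffff5ff098) is exactly what a sign-extended disp32 of 0xff5ff098 would produce.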
x86-32 had two redundant ways to encode [disp32] addressing modes with no registers. x86-64 repurposes the shorter one as RIP-relative, and leaves the longer one as [sign-extended-disp32] absolute addressing.
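As a rough illustration of the two encodings, the sketch below classifies a ModRM byte (plus an optional SIB byte) in 64-bit mode. The function name is hypothetical, and it deliberately ignores REX.X/REX.B, so it is a simplification rather than a full decoder.

```python
from typing import Optional

def classify_no_register_mode(modrm: int, sib: Optional[int] = None) -> str:
    """Classify the register-less addressing modes in 64-bit mode.

    Hypothetical helper; assumes REX.X and REX.B are clear.
    """
    mod = modrm >> 6
    rm = modrm & 0b111
    if mod == 0b00 and rm == 0b101:
        # Shorter encoding (no SIB byte): plain [disp32] in 32-bit mode,
        # repurposed as [rip + rel32] in 64-bit mode.
        return "rip-relative"
    if mod == 0b00 and rm == 0b100 and sib is not None:
        index = (sib >> 3) & 0b111
        base = sib & 0b111
        if index == 0b100 and base == 0b101:
            # Longer encoding: SIB with no index and no base, disp32 only.
            return "absolute disp32"
    return "other"

print(classify_no_register_mode(0x05))        # rip-relative
print(classify_no_register_mode(0x04, 0x25))  # absolute disp32
```

The byte pairs 05 and 04 25 are the ones that appear after the 8b opcode in the disassembly of the NASM example in this answer.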
I'll use NASM syntax for an example. You can use default rel to use RIP-relative by default, and you can override on a case-by-case basis like this:
MOV RAX, [abs FS:_start] ; _start just as something that assembles
MOV RAX, [rel FS:_start] ; RIP-rel for thread-local is usually not useful!
64 48 8b 04 25 b5 00 40 00 mov rax,QWORD PTR fs:0x4000b5
64 48 8b 05 e7 fe ff ff mov rax,QWORD PTR fs:[rip+0xfffffffffffffee7] # 4000b5 <_start>
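The RIP-relative target in the second line can be checked by hand: the last four bytes (e7 fe ff ff) are a little-endian signed rel32, and it is added to the address of the next instruction. Here is a small Python sketch of that calculation. The listing does not show the instruction's own address, so the next-instruction RIP of 0x4001ce used below is an assumption, inferred from the target 0x4000b5 and the displacement.

```python
import struct

MASK64 = (1 << 64) - 1

def rip_relative_target(next_rip: int, rel32_bytes: bytes):
    """Return (target, rel32) for a RIP-relative operand.

    Hypothetical helper: next_rip is the address of the byte right after
    the instruction, which is what the CPU adds rel32 to.
    """
    (rel32,) = struct.unpack("<i", rel32_bytes)  # little-endian signed 32-bit
    return (next_rip + rel32) & MASK64, rel32

# rel32 bytes taken from the encoding 64 48 8b 05 e7 fe ff ff
target, rel32 = rip_relative_target(0x4001ce, bytes.fromhex("e7feffff"))
print(rel32)        # -281
print(hex(target))  # 0x4000b5
```

The disassembler prints the displacement as 0xfffffffffffffee7, which is the same value (-281) shown as a sign-extended 64-bit number.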
Is this address a possible address?
Yes, addresses in the upper half of the canonical range are used. For example, less /proc/self/maps shows that Linux maps the vsyscall page it exports into high address space:
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Canonical address means that bits [63:48] are copies of bit 47 (with 48-bit virtual addresses), i.e. the address is what you would get from sign-extending the low 48 bits. Non-canonical addresses will always fault on current hardware, so if you want to implement something like tagged pointers with those redundant bits, you still have to redo the sign-extension before dereferencing.
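The canonical check and the "redo the sign-extension" step can be sketched in Python as follows. The helper names are made up for illustration, and the code assumes 48-bit virtual addresses, matching the bit 47 rule above.

```python
VA_BITS = 48                 # assumption: 48-bit virtual addresses
MASK64 = (1 << 64) - 1

def is_canonical(addr: int, va_bits: int = VA_BITS) -> bool:
    """True if bits [63:va_bits] are all copies of bit va_bits-1."""
    upper = (addr & MASK64) >> (va_bits - 1)    # bits [63:47] for 48-bit VAs
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return upper == 0 or upper == all_ones

def canonicalize(addr: int, va_bits: int = VA_BITS) -> int:
    """Drop any tag in the upper bits and sign-extend the low va_bits."""
    low_mask = (1 << va_bits) - 1
    low = addr & low_mask
    if low >> (va_bits - 1):                    # bit 47 set: upper half
        low |= MASK64 & ~low_mask
    return low

print(is_canonical(0xffffffffff5ff098))        # True  (sign-extended)
print(is_canonical(0x00000000ff5ff098))        # True  (zero-extended, lower half)
print(is_canonical(0x0000800000000000))        # False (bit 47 set, upper bits clear)
print(hex(canonicalize(0x1234ffffff600000)))   # 0xffffffffff600000
```

Note that both the sign-extended and the zero-extended forms of 0xff5ff098 are canonical; they are simply different addresses, one in each half of the address space.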
Note that qword ptr tells you the operand-size, not anything about how the addressing mode is encoded.