I am working on a project in C, and I've run into an issue. I am trying to hardcode an x86_64 instruction, but the memory addresses aren't coming out quite right. Really, the problem itself is simple; I am just stuck figuring out its solution.
In GDB, I get the following:
(gdb) x /7ib f
...
0x7ffff7ff7005: callq 0x80000040072b
That's fine and all, except for one thing: the address I want is, according to GDB, 0x40072b. (On a related note, out of curiosity, why is the memory address so high for f?) How can I fix this? For reference, here is the hex of the portion I'm working on (just these six bytes):
(gdb) x /6xb 0x7ffff7ff7005
0x7ffff7ff7005: 0x48 0xe8 0x20 0x97 0x40 0x08
Thanks for any and all assistance.
Update:
I have been asked to explain how I am coming up with this offset:
Here's what I am working on: I want to implement a way for me to use closures in C, which I am trying to do by implementing what I've found in this article (the part I'm basing this off of is at the very end... and yes, I am aware that the solution to this will be architecture-specific).
Essentially, it encodes a thunk as a packed structure with the necessary opcodes to load an environment and call the desired function's location in memory, which then (to the dismay of Dennis Ritchie's ghost) is cast to a cfunc, which is defined as
typedef void (* cfunc)();
After that, it is called as an ordinary function.
It does this by creating a thunk struct and, among one or two other things, calculating the offset for the callq GAS operation using the following line:
    (Pointer to first byte of the ------+
     instruction following the CALL op) |
                                        |
                                        |
     (Function to be called by CALL op) |
                      |                 |
                      |                 |
thunk->call_offset = code - (void *)&thunk->add_esp[0];
         |
         +--- A Signed Long in the 32-bit version
              (Because Longs are 8 bytes in 64-bit
               GCC, I have changed it to being a
               Signed Int)
I know this is feasible, for it actually works when compiling the original code in 32-bit mode. What I am trying to do is modify the code to work in 64-bit mode. I presume that I need to pad the offset with some sort of value in order to make it point to the correct memory address, but I am unsure what that value is. That, or perhaps there is another way of writing a call opcode which I can use that will point to the correct memory address.
0x48 0xe8 0x20 0x97 0x40 0x08
Translates to:
48 e8 20974008
|  |  |
|  |  +------ Offset (As of opcode)
|  +--------- Opcode (CALL)
+------------ REX prefix
0100 1000
|    ||||
|    |||+-- B - Extension to MODRM.rm or SIB.base
|    ||+--- X - Extension to SIB.index
|    |+---- R - Extension to MODRM.reg
|    +----- W - 64-bit operand size, else (usually) 32-bit
+---------- Fixed bit pattern
In other words: 64-bit operand size.
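The bit breakdown above can be checked mechanically. The following is a minimal sketch, not taken from the article or the original code; the names `rex_fields` and `decode_rex` are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: the four flag bits of a REX prefix (0100WRXB). */
struct rex_fields {
    bool w; /* 64-bit operand size */
    bool r; /* extension to MODRM.reg */
    bool x; /* extension to SIB.index */
    bool b; /* extension to MODRM.rm or SIB.base */
};

/* Returns true and fills *out if `byte` has the fixed 0100 high nibble
 * of a REX prefix; returns false (leaving *out untouched) otherwise. */
static bool decode_rex(uint8_t byte, struct rex_fields *out)
{
    assert(out != NULL);
    if ((byte & 0xF0) != 0x40)
        return false;
    out->w = (byte >> 3) & 1;
    out->r = (byte >> 2) & 1;
    out->x = (byte >> 1) & 1;
    out->b = byte & 1;
    return true;
}
```

For the byte 0x48 this reports W set and R, X, B clear, which matches the diagram.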
Looking at the instruction manual, one finds:
CALL e8 cd
Call near, relative, displacement relative to next instruction. 32-bit displacement sign extended to 64-bits in 64-bit mode.
Where:
cd — A 4-byte value following the opcode. This value is used to specify a code offset and possibly a new value for the code segment register.
The cd is in this case:
20 97 40 08
Converting from little-endian to big-endian byte order, we get:
08409720 + 6
We add 6 because the offset is relative to the next instruction, and this instruction is six bytes long:
0x48 0xe8 0x20 0x97 0x40 0x08
In other words:
callq fun_08409726
From your GDB printout:
(gdb) x /7ib f
...
0x7ffff7ff7005: callq 0x80000040072b
GDB arrives at that target from the instruction's address 0x7ffff7ff7005 by:
0x7ffff7ff7005 + 0x08409726 = 0x80000040072b
     |              |              |
     |              |              +-------- Result address (same as in GDB).
     |              +----------------------- The offset we calculated above.
     +-------------------------------------- Memory offset of the instruction.
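The same arithmetic can be written as a small C sketch. This is only an illustration of the decoding steps described here; `read_le32` and `call_rel32_target` are hypothetical names, and the code handles only the `[REX] E8 rel32` form discussed in this answer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble four little-endian bytes into a signed 32-bit displacement.
 * The unsigned-to-signed conversion is implementation-defined in C, but
 * behaves as two's complement on GCC and Clang. */
static int32_t read_le32(const uint8_t *p)
{
    assert(p != NULL);
    uint32_t v = (uint32_t)p[0]
               | (uint32_t)p[1] << 8
               | (uint32_t)p[2] << 16
               | (uint32_t)p[3] << 24;
    return (int32_t)v;
}

/* Compute the target of a "[REX] E8 rel32" call located at insn_addr.
 * Returns 0 if the bytes are not such an instruction. */
static uint64_t call_rel32_target(const uint8_t *insn, uint64_t insn_addr)
{
    assert(insn != NULL);
    size_t i = 0;
    if ((insn[i] & 0xF0) == 0x40)   /* skip an optional REX prefix */
        i++;
    if (insn[i] != 0xE8)            /* must be CALL rel32 */
        return 0;
    i++;
    int32_t rel = read_le32(insn + i);
    uint64_t next = insn_addr + i + 4;      /* address of the next instruction */
    return next + (uint64_t)(int64_t)rel;   /* sign-extend, then add */
}
```

Feeding it the six bytes from the question with the address 0x7ffff7ff7005 gives 0x80000040072b, the same value GDB prints.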
There might be something going on in GDB, but that does not look quite right.
The (virtual) address 0x000080000040072b is above 0x00007fffffffffff. The address comes out that way because of the offset encoded in the instruction. As for how that offset is generated: since you say “I am trying to hardcode an x86_64 instruction”, you probably know that best yourself.
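One possible explanation, offered as a guess from the numbers rather than from your code: if the intended target is 0x40072b and the thunk sits at the address GDB shows, the true distance does not fit in a signed 32-bit displacement, and storing it in a 32-bit field keeps only the low 32 bits. The sketch below uses the hypothetical helpers `rel32_for_call` and `truncated_rel32` to show the range check and the truncation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Try to compute the rel32 for a call whose following instruction is at
 * next_insn and which should reach target. Returns false if the distance
 * does not fit in a signed 32-bit displacement. The uint64_t-to-int64_t
 * conversion assumes two's complement behaviour (true on GCC and Clang). */
static bool rel32_for_call(uint64_t target, uint64_t next_insn, int32_t *out)
{
    assert(out != NULL);
    int64_t diff = (int64_t)(target - next_insn);
    if (diff < INT32_MIN || diff > INT32_MAX)
        return false;
    *out = (int32_t)diff;
    return true;
}

/* What lands in the instruction if the distance is stored in a 32-bit
 * field without a range check: only the low 32 bits survive. */
static uint32_t truncated_rel32(uint64_t target, uint64_t next_insn)
{
    return (uint32_t)(target - next_insn);
}
```

With target 0x40072b and next instruction 0x7ffff7ff700b, `rel32_for_call` reports that the distance is out of range, and `truncated_rel32` yields 0x08409720, which is exactly the `20 97 40 08` in the dump. If that is what is happening, no padding of the offset will help. The usual options are to place the thunk within 2 GiB of the function, or to use an absolute indirect call (for example, loading the 64-bit address into a scratch register and calling through it); which register is safe to clobber depends on the rest of the thunk.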