Tags: c, gdb, x86-64, gnu-assembler, opcode

How to Fix x86_64 Memory Offsets (GAS)?


I am working on a project in C, and I've run into an issue. I am trying to hardcode an x86_64 instruction, but the memory addresses aren't coming out quite right. Really, the problem itself is simple; I am just stuck figuring out its solution.

In GDB, I get the following:

(gdb) x /7ib f
...
0x7ffff7ff7005: callq  0x80000040072b

That's fine and all, except for one thing: according to GDB, the address I want is 0x40072b. (On a related note, out of curiosity, why is the memory address of f so high?) How can I fix this? For reference, here is the hex of the portion I'm working on (just these six bytes):

(gdb) x /6xb 0x7ffff7ff7005
0x7ffff7ff7005: 0x48    0xe8    0x20    0x97    0x40    0x08

Thanks for any and all assistance.

Update:

I was asked to explain how I arrive at this offset:

Here's what I am working on: I want to implement a way for me to use closures in C, which I am trying to do by implementing what I've found in this article (the part I'm basing this off of is at the very end... and yes, I am aware that the solution to this will be architecture-specific).

Essentially, it encodes a thunk as a packed structure with the necessary opcodes to load an environment and call the desired function's location in memory, which then (to the dismay of Dennis Ritchie's ghost) is cast to a cfunc, which is defined as

typedef void (* cfunc)();

After which, it is called as an ordinary function.

It does this by creating a thunk struct and, among one or two other things, calculating the offset for the callq GAS operation using the following line:

             (Pointer to first byte of the ------+
             instruction following the CALL op)  |
                                                 |
                                                 |
          (Function to be called by CALL op)     |
                      |                          |
                      |                          |
thunk->call_offset = code - (void *)&thunk->add_esp[0];
        |
        +--- A Signed Long in the 32-bit version
             (Because Longs are 8 bytes in 64-bit
              GCC, I have changed it to being a
              Signed Int)

I know this is feasible, for it actually works when compiling the original code in 32-bit mode. What I am trying to do is modify the code to work in 64-bit mode. I presume that I need to pad the offset with some sort of value in order to make it point to the correct memory address, but I am unsure what that value is. That, or perhaps there is another way of writing a call opcode which I can use that will point to the correct memory address.


Solution

  • 0x48 0xe8 0x20 0x97 0x40 0x08

    Translates to:

    48 e8 20974008
     |  |    |
     |  |    +------ Offset (32-bit displacement, stored little-endian)
     |  +----------- Opcode (CALL)
     +-------------- REX prefix
    

    48: REX prefix

      0100 1000
        |  ||||
        |  |||+-- B - Extension to MODRM.rm or SIB.base
        |  ||+--- X - Extension to SIB.index
        |  |+---- R - Extension to MODRM.reg
        |  +----- W - 64-bit operand size, else (usually) 32-bit
        +-------- Fixed bit pattern
    

    In other words: 64-bit operand size.

    e8: Opcode

    Looking at the instruction manual, one finds:

    CALL e8 cd

    Call near, relative, displacement relative to next instruction. 32-bit displacement sign extended to 64-bits in 64-bit mode.

    Where:

    cd — A 4-byte value following the opcode. This value is used to specify a code offset and possibly a new value for the code segment register.

    The cd is in this case:

    20 97 40 08

    Reordering the bytes from little-endian to big-endian, we get:

    08409720 + 6

    We add 6 because the offset is relative to the next instruction, and this instruction is six bytes long:

    0x48 0xe8 0x20 0x97 0x40 0x08

    In other words, relative to the address of the call instruction itself:

    callq fun_08409726
    

    From your GDB output:

    (gdb) x /7ib f
    ...
    0x7ffff7ff7005: callq  0x80000040072b
    

    You get the target address from the instruction's address 0x7ffff7ff7005 by:

    0x7ffff7ff7005 + 0x08409726 = 0x80000040072b
          |              |              |
          |              |              +-------- Result address (same as in GDB).
          |              +----------------------- The offset we calculated above.
          +-------------------------------------- Address of the instruction.
    

    GDB may be doing something I am not aware of, but that address does not look right.

    The (virtual) address 0x000080000040072b is above 0x00007fffffffffff, the highest address in the lower canonical half of a 48-bit virtual address space. The address comes out this way because of the offset encoded in the instruction. As for how that offset is generated (as you say, “I am trying to hardcode an x86_64 instruction”), you probably know that best yourself.