Tags: linux, debugging, assembly, x86, aes

Why can't I single-step into the aeskeygenassist instruction in self-modifying code?


I am trying to implement AES-128 encryption in assembly language; my goal is to find out the final value. When debugging by single-stepping, the debugger stops at address 0x8048074.

Here is the code:

global _start
section .text

_start:

pxor    xmm2, xmm2
pxor    xmm3, xmm3
mov     bx, 0x36e5
mov     ah, 0x73

roundloop:
shr     ax, 7
div     bl
mov     byte [sdfsdf+5], ah

sdfsdf:
aeskeygenassist xmm1, xmm0, 0x45
pshufd  xmm1, xmm1, 0xff

shuffle:
shufps  xmm2, xmm0, 0x10
pxor    xmm0, xmm2
xor     byte [shuffle+3], 0x9c
js      short shuffle

pxor    xmm0, xmm1
cmp     ah, bh
jz      short lastround

aesenc  xmm3, xmm0
jmp     short roundloop

lastround:
aesenclast xmm3, xmm0
ret

The debugger gets stuck here; I cannot single-step to 0x804807a:

[-------------------------------------code-------------------------------------]
   0x804806c <_start+12>:       mov    ah,0x73
   0x804806e <roundloop>:       shr    ax,0x7
   0x8048072 <roundloop+4>:     div    bl
=> 0x8048074 <roundloop+6>:     mov    BYTE PTR ds:0x804807f,ah
   0x804807a <sdfsdf>:  aeskeygenassist xmm1,xmm0,0x45
   0x8048080 <sdfsdf+6>:        pshufd xmm1,xmm1,0xff
   0x8048085 <shuffle>: shufps xmm2,xmm0,0x10
   0x8048089 <shuffle+4>:       pxor   xmm0,xmm2

I'm using the PEDA plugin for GDB.

EDIT:

Sorry, I didn't mention the error message. It is a segmentation fault at this instruction: mov BYTE PTR ds:0x804807f,ah


Solution

  • I assume you forgot to link with --omagic to make the .text section writable.


    So mov BYTE PTR ds:0x804807f,ah segfaults, and it's right before aeskeygenassist. You can't keep single-stepping after your program crashes. (You have no handler for SIGSEGV, and the default action is to terminate your program).

    When I tried this on my desktop out of curiosity, I could see how the behaviour might be read as single-stepping getting "stuck" before aeskeygenassist, if you ignore the segfault message and the fact that trying again says "the program is no longer running".

    From a GDB session:

    (gdb) layout reg
    (gdb) starti          # like run with an implicit breakpoint on the first instruction
    (gdb) si
    0x0000000000401004 in _start ()
    0x0000000000401008 in _start ()     ## I kept pressing return to repeat the command
    0x000000000040100c in _start ()
    0x000000000040100e in roundloop ()
    0x0000000000401012 in roundloop ()
    0x0000000000401014 in roundloop ()    # the MOV store
    
    Program received signal SIGSEGV, Segmentation fault.
    0x0000000000401014 in roundloop ()    # still pointing at the MOV store
    

    Notice that RIP is still pointing at the mov: 0x8048074 in your 32-bit build (where the register is EIP), 0x401014 in my 64-bit build of the same source.


    From the ld manual:

    -N
    --omagic
    Set the text and data sections to be readable and writable. Also, do not page-align the data segment, and disable linking against shared libraries. If the output format supports Unix style magic numbers, mark the output as "OMAGIC". Note: Although a writable text section is allowed for PE-COFF targets, it does not conform to the format specification published by Microsoft.

    Your code works fine for me if I link with:

      nasm -felf64 aes.asm &&
      ld --omagic aes.o -o aes
    

    Alternatively, you could make an mprotect system call to give the page containing this code PROT_READ|PROT_WRITE|PROT_EXEC.
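
    As a rough sketch of that mprotect approach for a 32-bit build using the int 0x80 ABI: the fragment below would go at the very top of _start, before any registers are initialised. It assumes 4 KiB pages, and smc_end is a hypothetical label you would place after the last instruction that gets modified (for example, right after the js short shuffle). The kernel requires the address to be page-aligned and rounds the length up to whole pages.

    ```nasm
    ; Sketch only: make the page(s) holding the modified code RWX.
    ; Assumptions: 32-bit Linux, 4 KiB pages, smc_end is a hypothetical label.
    _start:
        mov     ebx, _start         ; address of the first code byte
        and     ebx, -4096          ; round down to a page boundary (addr argument)
        mov     ecx, smc_end        ; end of the region that gets stored to
        sub     ecx, ebx            ; len = end - page start
        mov     edx, 7              ; PROT_READ|PROT_WRITE|PROT_EXEC = 1|2|4
        mov     eax, 125            ; __NR_mprotect on i386
        int     0x80                ; eax = 0 on success, -errno on failure
        ; ... the original pxor / mov setup continues here
    ```

    On success EAX is 0 afterwards, so the mov ah, 0x73 that follows still starts from a zero AL.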

    GDB's layout reg disassembly window even updates the disassembly of aeskeygenassist after its immediate is modified by the store.


    Also note that self-modifying code (SMC) is extremely slow on modern x86: every store near instructions being executed triggers a full pipeline nuke. You'd be much better off unrolling with an assembler macro.
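
    A minimal sketch of what such an unrolled version might look like, replacing everything from roundloop through lastround. KEYROUND and ROUND are hypothetical macro names. The two shufps immediates are the ones your code toggles between (0x10, and 0x10 ^ 0x9c = 0x8c), and the round constants are the standard AES-128 ones, which your shr/div sequence appears to generate if AL starts at zero.

    ```nasm
    ; Sketch only: one key-schedule step per macro expansion, no stores to code.
    %macro KEYROUND 1                   ; %1 = round constant (immediate)
        aeskeygenassist xmm1, xmm0, %1
        pshufd  xmm1, xmm1, 0xff
        shufps  xmm2, xmm0, 0x10        ; first pass of the old "shuffle" loop
        pxor    xmm0, xmm2
        shufps  xmm2, xmm0, 0x8c        ; second pass (immediate after xor 0x9c)
        pxor    xmm0, xmm2
        pxor    xmm0, xmm1
    %endmacro

    %macro ROUND 1                      ; key step followed by a normal round
        KEYROUND %1
        aesenc  xmm3, xmm0
    %endmacro

        pxor    xmm2, xmm2
        pxor    xmm3, xmm3
        ROUND   0x01
        ROUND   0x02
        ROUND   0x04
        ROUND   0x08
        ROUND   0x10
        ROUND   0x20
        ROUND   0x40
        ROUND   0x80
        ROUND   0x1b
        KEYROUND 0x36                   ; last round key
        aesenclast xmm3, xmm0
    ```

    This costs more code bytes than the loop, but nothing writes into .text, so it needs neither --omagic nor mprotect.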

    Also, you can't ret from _start under Linux; it's not a function. The stack pointer points to argc, not a return address. Make an _exit system call with int 0x80 for 32-bit code. By "works" I mean that it reaches that ret and then segfaults on code-fetch from address 1, after popping argc into RIP.
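
    For a 32-bit build, the end of the program could look roughly like this instead of the ret (the exit status of 0 is an arbitrary choice for the sketch):

    ```nasm
    lastround:
        aesenclast xmm3, xmm0
        mov     eax, 1              ; __NR_exit on i386
        xor     ebx, ebx            ; exit status 0
        int     0x80                ; does not return
    ```

    Set a breakpoint on the mov eax, 1 if you want to inspect xmm3 before the process exits.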

    Also, in 64-bit code, use default rel for RIP-relative addressing of the store; it's more compact. But I guess you're building a 32-bit executable out of this for some reason, based on your code addresses. I didn't notice that at first, which is why I tested it as a 64-bit executable. Fortunately you used labels correctly, and aeskeygenassist is the same length in both modes, so it still works.