I was reading the article Tips for Evading Anti-Virus During Pen Testing and was surprised by given Python program:
from ctypes import *
shellcode = '\xfc\xe8\x89\x00\x00....'
memorywithshell = create_string_buffer(shellcode, len(shellcode))
shell = cast(memorywithshell, CFUNCTYPE(c_void_p))
shell()
The shellcode is shortened. Can someone explain what is going on? I'm familiar with both Python and C, I've tried read on the ctypes
module, but there are two main questions left:
What is stored in shellcode
?
I know this has something to do with C (in the article it is an shellcode from Metasploit and a different notation for ASCII was chosen), but I cannot identify whether if it's C source (probably not) or originates from some sort of compilation (which?).
Depending on the first question, what's the magic happening during the cast?
Have a look at this shellcode, I toke it from here (it pops up a MessageBoxA):
#include <stdio.h>
typedef void (* function_t)(void);
unsigned char shellcode[] =
"\xFC\x33\xD2\xB2\x30\x64\xFF\x32\x5A\x8B"
"\x52\x0C\x8B\x52\x14\x8B\x72\x28\x33\xC9"
"\xB1\x18\x33\xFF\x33\xC0\xAC\x3C\x61\x7C"
"\x02\x2C\x20\xC1\xCF\x0D\x03\xF8\xE2\xF0"
"\x81\xFF\x5B\xBC\x4A\x6A\x8B\x5A\x10\x8B"
"\x12\x75\xDA\x8B\x53\x3C\x03\xD3\xFF\x72"
"\x34\x8B\x52\x78\x03\xD3\x8B\x72\x20\x03"
"\xF3\x33\xC9\x41\xAD\x03\xC3\x81\x38\x47"
"\x65\x74\x50\x75\xF4\x81\x78\x04\x72\x6F"
"\x63\x41\x75\xEB\x81\x78\x08\x64\x64\x72"
"\x65\x75\xE2\x49\x8B\x72\x24\x03\xF3\x66"
"\x8B\x0C\x4E\x8B\x72\x1C\x03\xF3\x8B\x14"
"\x8E\x03\xD3\x52\x33\xFF\x57\x68\x61\x72"
"\x79\x41\x68\x4C\x69\x62\x72\x68\x4C\x6F"
"\x61\x64\x54\x53\xFF\xD2\x68\x33\x32\x01"
"\x01\x66\x89\x7C\x24\x02\x68\x75\x73\x65"
"\x72\x54\xFF\xD0\x68\x6F\x78\x41\x01\x8B"
"\xDF\x88\x5C\x24\x03\x68\x61\x67\x65\x42"
"\x68\x4D\x65\x73\x73\x54\x50\xFF\x54\x24"
"\x2C\x57\x68\x4F\x5F\x6F\x21\x8B\xDC\x57"
"\x53\x53\x57\xFF\xD0\x68\x65\x73\x73\x01"
"\x8B\xDF\x88\x5C\x24\x03\x68\x50\x72\x6F"
"\x63\x68\x45\x78\x69\x74\x54\xFF\x74\x24"
"\x40\xFF\x54\x24\x40\x57\xFF\xD0";
void real_function(void) {
puts("I'm here");
}
int main(int argc, char **argv)
{
function_t function = (function_t) &shellcode[0];
real_function();
function();
return 0;
}
Compile it an hook it under any debugger, I'll use gdb:
> gcc shellcode.c -o shellcode
> gdb -q shellcode.exe
Reading symbols from shellcode.exe...done.
(gdb)
>
Disassemble the main function to see that different between calling real_function
and function
:
(gdb) disassemble main
Dump of assembler code for function main:
0x004013a0 <+0>: push %ebp
0x004013a1 <+1>: mov %esp,%ebp
0x004013a3 <+3>: and $0xfffffff0,%esp
0x004013a6 <+6>: sub $0x10,%esp
0x004013a9 <+9>: call 0x4018e4 <__main>
0x004013ae <+14>: movl $0x402000,0xc(%esp)
0x004013b6 <+22>: call 0x40138c <real_function> ; <- here we call our `real_function`
0x004013bb <+27>: mov 0xc(%esp),%eax
0x004013bf <+31>: call *%eax ; <- here we call the address that is loaded in eax (the address of the beginning of our shellcode)
0x004013c1 <+33>: mov $0x0,%eax
0x004013c6 <+38>: leave
0x004013c7 <+39>: ret
End of assembler dump.
(gdb)
There are two call
, let's make a break point at <main+31>
to see what is loaded in eax:
(gdb) break *(main+31)
Breakpoint 1 at 0x4013bf
(gdb) run
Starting program: shellcode.exe
[New Thread 2856.0xb24]
I'm here
Breakpoint 1, 0x004013bf in main ()
(gdb) disassemble
Dump of assembler code for function main:
0x004013a0 <+0>: push %ebp
0x004013a1 <+1>: mov %esp,%ebp
0x004013a3 <+3>: and $0xfffffff0,%esp
0x004013a6 <+6>: sub $0x10,%esp
0x004013a9 <+9>: call 0x4018e4 <__main>
0x004013ae <+14>: movl $0x402000,0xc(%esp)
0x004013b6 <+22>: call 0x40138c <real_function>
0x004013bb <+27>: mov 0xc(%esp),%eax
=> 0x004013bf <+31>: call *%eax ; now we are here
0x004013c1 <+33>: mov $0x0,%eax
0x004013c6 <+38>: leave
0x004013c7 <+39>: ret
End of assembler dump.
(gdb)
Look at the first 3 bytes of the data that the address in eax continues:
(gdb) x/3x $eax
0x402000 <shellcode>: 0xfc 0x33 0xd2
(gdb) ^-------^--------^---- the first 3 bytes of the shellcode
So the CPU will call 0x402000
, the beginning of our shell code at 0x402000
, lets disassemble what ever at 0x402000
:
(gdb) disassemble 0x402000
Dump of assembler code for function shellcode:
0x00402000 <+0>: cld
0x00402001 <+1>: xor %edx,%edx
0x00402003 <+3>: mov $0x30,%dl
0x00402005 <+5>: pushl %fs:(%edx)
0x00402008 <+8>: pop %edx
0x00402009 <+9>: mov 0xc(%edx),%edx
0x0040200c <+12>: mov 0x14(%edx),%edx
0x0040200f <+15>: mov 0x28(%edx),%esi
0x00402012 <+18>: xor %ecx,%ecx
0x00402014 <+20>: mov $0x18,%cl
0x00402016 <+22>: xor %edi,%edi
0x00402018 <+24>: xor %eax,%eax
0x0040201a <+26>: lods %ds:(%esi),%al
0x0040201b <+27>: cmp $0x61,%al
0x0040201d <+29>: jl 0x402021 <shellcode+33>
....
As you see, a shellcode is nothing more than assembly instructions, the only different is in the way you write these instructions, it uses special techniques to make it more portable, for example never use a fixed address.
The python equivalent to the above program:
#!python
from ctypes import *
shellcode_data = "\
\xFC\x33\xD2\xB2\x30\x64\xFF\x32\x5A\x8B\
\x52\x0C\x8B\x52\x14\x8B\x72\x28\x33\xC9\
\xB1\x18\x33\xFF\x33\xC0\xAC\x3C\x61\x7C\
\x02\x2C\x20\xC1\xCF\x0D\x03\xF8\xE2\xF0\
\x81\xFF\x5B\xBC\x4A\x6A\x8B\x5A\x10\x8B\
\x12\x75\xDA\x8B\x53\x3C\x03\xD3\xFF\x72\
\x34\x8B\x52\x78\x03\xD3\x8B\x72\x20\x03\
\xF3\x33\xC9\x41\xAD\x03\xC3\x81\x38\x47\
\x65\x74\x50\x75\xF4\x81\x78\x04\x72\x6F\
\x63\x41\x75\xEB\x81\x78\x08\x64\x64\x72\
\x65\x75\xE2\x49\x8B\x72\x24\x03\xF3\x66\
\x8B\x0C\x4E\x8B\x72\x1C\x03\xF3\x8B\x14\
\x8E\x03\xD3\x52\x33\xFF\x57\x68\x61\x72\
\x79\x41\x68\x4C\x69\x62\x72\x68\x4C\x6F\
\x61\x64\x54\x53\xFF\xD2\x68\x33\x32\x01\
\x01\x66\x89\x7C\x24\x02\x68\x75\x73\x65\
\x72\x54\xFF\xD0\x68\x6F\x78\x41\x01\x8B\
\xDF\x88\x5C\x24\x03\x68\x61\x67\x65\x42\
\x68\x4D\x65\x73\x73\x54\x50\xFF\x54\x24\
\x2C\x57\x68\x4F\x5F\x6F\x21\x8B\xDC\x57\
\x53\x53\x57\xFF\xD0\x68\x65\x73\x73\x01\
\x8B\xDF\x88\x5C\x24\x03\x68\x50\x72\x6F\
\x63\x68\x45\x78\x69\x74\x54\xFF\x74\x24\
\x40\xFF\x54\x24\x40\x57\xFF\xD0"
shellcode = c_char_p(shellcode_data)
function = cast(shellcode, CFUNCTYPE(None))
function()