Question: Why does the program throw an "access violation executing location" exception after the shellcode accomplishes its goal successfully?
Description: My objective is to load and then unload a DLL in the current process using x86 shellcode that calls Windows API functions. The program accomplishes this, but afterwards Visual Studio reports an access violation executing a location. I know the shellcode itself runs because the test DLL prints a message when it attaches and when it detaches. Note that this only happens when calling the unloading function; the loading function works with no problems. (I am doing this on Windows 10 in Visual Studio 2019 using C++20, if that's important.)
I'm aware that the shellcode does not set up a proper stack frame, but I made sure ESP is restored before execution returns to the calling function. I also save EAX and restore it in the unloading function. The end goal of this test program is to produce shellcode I can use for a remote-thread context patching method in a DLL injection program I am working on. I have also verified the offsets used to find the return addresses multiple times. Any help is appreciated, thank you!
Here is the console output.
Attached! DLLMain at 0x79EF134D
Detached!
Here is the exception thrown.
Exception thrown at 0x9269D814 in Shellcode DLL Loading.exe: 0xC0000005: Access violation executing location 0x9269D814.
Here is the main file; it is only around 120 lines.
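const dword follow_relative_jump(const pbyte pointer)
{
    if (pointer)
    {
        if (pointer[0] == 0xE9)
        {
            // E9 rel32: 5-byte near jump, displacement is relative to the next instruction
            return reinterpret_cast<dword>(pointer + 5 + reinterpret_cast<psdword>(pointer + 1)[0]);
        }
        if (pointer[0] == 0xEB)
        {
            // EB rel8: 2-byte short jump with a signed 8-bit displacement
            return reinterpret_cast<dword>(pointer + 2 + static_cast<signed char>(pointer[1]));
        }
    }
    return reinterpret_cast<dword>(pointer);
}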
void load_dll(const dword path_address)
{
    /*
    68 90 90 90 90 -> push 0x???????? (return address buffer)
    68 90 90 90 90 -> push 0x???????? (LoadLibraryA() address buffer)
    68 90 90 90 90 -> push 0x???????? (DLL path address buffer)
    FF 54 24 04    -> call [esp + 4]  (calling LoadLibraryA())
    83 C4 08       -> add esp, 8      (cleaning up the stack, except for return address)
    C3             -> ret             (return to return address that was pushed first, it should pop it off the stack and return ESP to normal)
    */
    std::vector<byte> shellcode = {
        0x68, 0x90, 0x90, 0x90, 0x90,
        0x68, 0x90, 0x90, 0x90, 0x90,
        0x68, 0x90, 0x90, 0x90, 0x90,
        0xFF, 0x54, 0x24, 0x04,
        0x83, 0xC4, 0x08,
        0xC3
    };
    // Offset is the distance from the function prologue to the next instruction after the call to load_dll()
    reinterpret_cast<pdword>(shellcode.data() + 1)[0] = follow_relative_jump(reinterpret_cast<pbyte>(&load_dll)) + 0x22C;
    reinterpret_cast<pdword>(shellcode.data() + 6)[0] = reinterpret_cast<dword>(&LoadLibraryA);
    reinterpret_cast<pdword>(shellcode.data() + 11)[0] = path_address;
    if (const auto allocation = VirtualAlloc(NULL, shellcode.size(), MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE))
    {
        memcpy(allocation, shellcode.data(), shellcode.size());
        reinterpret_cast<void(__cdecl*)()>(allocation)();
        // Releasing a whole allocation requires MEM_RELEASE with a size of 0 (MEM_FREE is not a valid free type)
        VirtualFree(allocation, 0, MEM_RELEASE);
    }
}
void unload_dll(const dword path_address)
{
    /*
    68 90 90 90 90 -> push 0x???????? (return address buffer)
    50             -> push eax        (save EAX so we can set it back later)
    68 90 90 90 90 -> push 0x???????? (GetModuleHandleA() address buffer)
    68 90 90 90 90 -> push 0x???????? (DLL path address buffer)
    FF 54 24 04    -> call [esp + 4]  (calling GetModuleHandleA())
    83 C4 08       -> add esp, 8      (clean up the stack, except for return address and saved EAX)
    68 90 90 90 90 -> push 0x???????? (FreeLibrary() address buffer)
    50             -> push eax        (Handle to module returned by GetModuleHandleA() in EAX)
    FF 54 24 04    -> call [esp + 4]  (calling FreeLibrary())
    83 C4 08       -> add esp, 8      (clean up stack, except for return address and saved EAX)
    58             -> pop eax         (set back EAX to what it was before)
    C3             -> ret             (return to return address that was pushed first, it should pop it off the stack and return ESP to normal)
    */
    std::vector<byte> shellcode = {
        0x68, 0x90, 0x90, 0x90, 0x90,
        0x50,
        0x68, 0x90, 0x90, 0x90, 0x90,
        0x68, 0x90, 0x90, 0x90, 0x90,
        0xFF, 0x54, 0x24, 0x04,
        0x83, 0xC4, 0x08,
        0x68, 0x90, 0x90, 0x90, 0x90,
        0x50,
        0xFF, 0x54, 0x24, 0x04,
        0x83, 0xC4, 0x08,
        0x58,
        0xC3
    };
    // Offset is the distance from the function prologue to the next instruction after the call to unload_dll()
    reinterpret_cast<pdword>(shellcode.data() + 1)[0] = follow_relative_jump(reinterpret_cast<pbyte>(&unload_dll)) + 0x2AF;
    reinterpret_cast<pdword>(shellcode.data() + 7)[0] = reinterpret_cast<dword>(&GetModuleHandleA);
    reinterpret_cast<pdword>(shellcode.data() + 12)[0] = path_address;
    reinterpret_cast<pdword>(shellcode.data() + 24)[0] = reinterpret_cast<dword>(&FreeLibrary);
    if (const auto allocation = VirtualAlloc(NULL, shellcode.size(), MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE))
    {
        memcpy(allocation, shellcode.data(), shellcode.size());
        reinterpret_cast<void(__cdecl*)()>(allocation)();
        // Releasing a whole allocation requires MEM_RELEASE with a size of 0 (MEM_FREE is not a valid free type)
        VirtualFree(allocation, 0, MEM_RELEASE);
    }
}
int main()
{
    const char* path = "C:\\Users\\maxbd\\Desktop\\test.dll";
    load_dll(reinterpret_cast<dword>(path));
    unload_dll(reinterpret_cast<dword>(path));
    static_cast<void>(std::getchar());
    return 0;
}
I didn't consider the calling conventions of the functions I was calling and how they are expected to operate. The Windows API functions use __stdcall, in which the called function pops its own arguments off the stack before it returns. So after each call I should only be popping the function address I pushed (add esp, 4), not the function argument as well (add esp, 8). Thanks for the information in your comment, Jester.
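To make the effect of the extra cleanup visible, here is a minimal sketch that replays the unloading shellcode's stack operations on a toy stack. It is not an emulator and executes no machine code; `ToyStack`, `run_unload`, `Outcome` and all the placeholder constants are hypothetical names invented for this illustration. It assumes each API call takes one dword argument that the __stdcall callee removes itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Placeholder values (hypothetical) so each stack slot is recognizable.
constexpr std::uint32_t kCallerData   = 0xDEADBEEF; // caller's data above the real return address
constexpr std::uint32_t kCallReturn   = 0x00401000; // pushed by the call through the function pointer
constexpr std::uint32_t kManualReturn = 0x00401005; // pushed by the shellcode's first push
constexpr std::uint32_t kSavedEax     = 0x11111111; // EAX value saved by "push eax"
constexpr std::uint32_t kFunction     = 0x22222222; // API address pushed before each call
constexpr std::uint32_t kPath         = 0x33333333; // DLL path argument
constexpr std::uint32_t kHandle       = 0x44444444; // module handle returned in EAX

// Toy model of a 32-bit stack: it only tracks which dwords are on it.
struct ToyStack {
    std::vector<std::uint32_t> slots; // back() plays the role of [esp]

    void push(std::uint32_t value) { slots.push_back(value); }

    std::uint32_t pop() {
        assert(!slots.empty());
        const std::uint32_t value = slots.back();
        slots.pop_back();
        return value;
    }

    // Models "add esp, 4 * dwords": drops entries without reading them.
    void discard(std::size_t dwords) {
        assert(slots.size() >= dwords);
        slots.resize(slots.size() - dwords);
    }

    // Models a call to a __stdcall function: the callee removes its own arguments.
    void call_stdcall(std::size_t arg_count) { discard(arg_count); }
};

struct Outcome {
    std::uint32_t return_target; // value the final ret jumps to
    std::uint32_t restored_eax;  // value loaded by "pop eax"
    std::size_t depth_after;     // dwords left on the stack after returning
};

// cleanup_dwords: dwords removed by "add esp, N" after each API call.
// ret_extra_dwords: 0 models "ret" (C3), 1 models "ret 4" (C2 04 00).
Outcome run_unload(std::size_t cleanup_dwords, std::size_t ret_extra_dwords) {
    ToyStack stack;
    stack.push(kCallerData);
    stack.push(kCallReturn);           // stack state on entry to the shellcode

    stack.push(kManualReturn);         // push return address
    stack.push(kSavedEax);             // push eax
    stack.push(kFunction);             // push GetModuleHandleA address
    stack.push(kPath);                 // push DLL path
    stack.call_stdcall(1);             // call [esp + 4]
    stack.discard(cleanup_dwords);     // add esp, N
    stack.push(kFunction);             // push FreeLibrary address
    stack.push(kHandle);               // push eax (module handle)
    stack.call_stdcall(1);             // call [esp + 4]
    stack.discard(cleanup_dwords);     // add esp, N
    const std::uint32_t eax = stack.pop();    // pop eax
    const std::uint32_t target = stack.pop(); // ret pops the jump target
    stack.discard(ret_extra_dwords);          // the immediate of "ret 4", if any
    return {target, eax, stack.slots.size()};
}
```

In this model, `run_unload(2, 0)` (the shellcode as posted in the question) restores the wrong value into EAX and returns to the caller's data rather than to a return address, which is consistent with the access violation at a garbage location. `run_unload(1, 1)` returns to the manually pushed address with EAX restored and only the caller's data left on the stack, while `run_unload(1, 0)` returns to the right place but leaves one dword too many, matching the ESP complaint described next.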
Also, I had to change the 0xC3 return instruction (ret) to 0xC2 0x04 0x00 (ret 4) so that an extra 4 bytes are popped off the stack after returning. I thought a plain 0xC3 return would be enough, but in this case it isn't: Visual Studio throws an exception about ESP being incorrect unless those extra bytes are popped. With the change, both loading and unloading the DLL work perfectly.
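As a rough sketch of how the unloading shellcode could look with both changes applied (add esp, 4 after each call, and ret 4 at the end): the final byte sequence is not shown in this answer, so the layout below is an assumption derived from the question's code. `patch_dword` and `build_unload_shellcode` are hypothetical helper names, and the addresses are passed in as plain 32-bit values so the snippet compiles without the Windows headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Writes a 32-bit value in little-endian byte order at the given offset.
void patch_dword(std::vector<std::uint8_t>& code, std::size_t offset, std::uint32_t value) {
    assert(offset + 4 <= code.size());
    for (std::size_t i = 0; i < 4; ++i) {
        code[offset + i] = static_cast<std::uint8_t>((value >> (8 * i)) & 0xFF);
    }
}

// Builds the unloading shellcode with the stdcall-aware cleanup (assumed layout).
std::vector<std::uint8_t> build_unload_shellcode(std::uint32_t return_address,
                                                 std::uint32_t get_module_handle,
                                                 std::uint32_t path_address,
                                                 std::uint32_t free_library) {
    std::vector<std::uint8_t> code = {
        0x68, 0x90, 0x90, 0x90, 0x90, // push return address
        0x50,                         // push eax (save EAX)
        0x68, 0x90, 0x90, 0x90, 0x90, // push GetModuleHandleA address
        0x68, 0x90, 0x90, 0x90, 0x90, // push DLL path address
        0xFF, 0x54, 0x24, 0x04,       // call [esp + 4]
        0x83, 0xC4, 0x04,             // add esp, 4 (only the pushed function address)
        0x68, 0x90, 0x90, 0x90, 0x90, // push FreeLibrary address
        0x50,                         // push eax (module handle)
        0xFF, 0x54, 0x24, 0x04,       // call [esp + 4]
        0x83, 0xC4, 0x04,             // add esp, 4 (only the pushed function address)
        0x58,                         // pop eax (restore EAX)
        0xC2, 0x04, 0x00              // ret 4
    };
    patch_dword(code, 1, return_address);
    patch_dword(code, 7, get_module_handle);
    patch_dword(code, 12, path_address);
    patch_dword(code, 24, free_library);
    return code;
}
```

The patch offsets (1, 7, 12, 24) are the same as in the question, because the two changed instructions come after the patched operands or keep their original length; only the final ret grows by two bytes.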
I also completely forgot that, since this is a test program, I am calling the shellcode as a __cdecl function through a function pointer instead of hijacking the execution of a remote thread and modifying EIP. The call instruction already pushes a return address, so I have no reason to push one manually. My voluntary failure to set up a stack frame is not what makes the return behave abnormally, since ret only uses ESP and never looks at EBP. Because call is used, there are two return addresses on the stack: ret pops the one I pushed and jumps to it, and the 4 in ret 4 then discards the one pushed automatically by call. When I apply the shellcode to my actual program I will try a relative jump instead of a return; it makes more sense in that case and it's neater.
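For the relative-jump idea, the displacement of a near jmp (opcode E9) is measured from the end of the 5-byte instruction, so it is target minus (jump address plus 5). The sketch below shows only that arithmetic; `encode_jmp_rel32` is a hypothetical helper name, and both addresses are assumed to be 32-bit addresses in the target process.

```cpp
#include <array>
#include <cstdint>

// Encodes "jmp rel32" (E9 xx xx xx xx) placed at jmp_address and landing on target_address.
std::array<std::uint8_t, 5> encode_jmp_rel32(std::uint32_t jmp_address, std::uint32_t target_address) {
    // Unsigned wraparound yields the correct two's-complement value for backward jumps.
    const std::uint32_t displacement = target_address - (jmp_address + 5u);
    std::array<std::uint8_t, 5> bytes = {
        0xE9,
        static_cast<std::uint8_t>(displacement & 0xFF),
        static_cast<std::uint8_t>((displacement >> 8) & 0xFF),
        static_cast<std::uint8_t>((displacement >> 16) & 0xFF),
        static_cast<std::uint8_t>((displacement >> 24) & 0xFF)
    };
    return bytes;
}
```

For example, a jump located at 0x1000 that should land on 0x2000 gets the displacement 0x2000 - 0x1005 = 0xFFB. Unlike a pushed return address, the encoded bytes depend on where the shellcode itself ends up in memory, so the jump has to be patched after the allocation address is known.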
I wouldn't be surprised if I'm misunderstanding this solution, but it seems to work so I will consider this solved.