Sorry for my weak English.
I'm trying to improve my asm skills, and I have found an easy entry point for working on it: calling machine code routines from C code.
I am using it this way:
char asmRoutineData[] =
{
0xC8, 0x00, 0x00, 0x00, // enter 0, 0
0xB8, 0xFF, 0x00, 0x00, 0x00, // mov eax, 255 (imm32 is little-endian)
0xC9, // leave
0xc3 // ret
};
int (*asmRoutine)(void) = (int (*)(void)) asmRoutineData;
int ret = asmRoutine();
This works really well for some routines, such as the one above.
Some others do not work:
1) I am having trouble obtaining a value passed on the stack.
This procedure
char asmRoutine_body[] =
{
0xC8, 0x00, 0x00, 0x00, //enter
0x8B, 0x45, 0x08, // mov eax, [ebp+8]
0xC9, //leave
0xC3
};
and
int ( *asmRoutine)(int, int, int) = ( int (*)(int, int, int)) asmRoutine_body;
int ret = asmRoutine(77,66,55);
should work as far as I know, but it does not.
I looked at the asm generated by the compiler and it seems to be correct:
mov eax,offset _asmRoutineData
push 55
push 66
push 77
call eax
add esp,12
_asmRoutineData label byte
db 200 //enter
db 0
db 0
db 0
db 139 // mov eax, dword [ebp+8H] ; 8B. 45, 08
db 69
db 8
db 201 //leave
db 195 //ret
I do not know what is wrong (it returns values other than the expected 77, or 66 or 55 for ebp+12 and ebp+16).
2) The second problem is that this way of calling machine code works for me with arithmetic instructions, but it crashes the application (some kind of system exception) on FPU or SSE instructions.
Why? And what should I do to make it work (I would love to write assembly routines this way)?
//EDIT
This is an SSE routine that should take two float4* vectors a and b, compute their dot product, and put the result into float4* c (float4 is a struct or array of 4 floats).
(Strange, because it should only take two vectors and return a float via eax, but I probably got it from the internet and have had no time to test and rewrite it.)
/*
enter 0, 0 ; 0034 _ C8, 0000, 00
mov eax, dword [ebp+8H] ; 0038 _ 8B. 45, 08
mov ebx, dword [ebp+0CH] ; 003B _ 8B. 5D, 0C
mov ecx, dword [ebp+10H] ; 003E _ 8B. 4D, 10
movups xmm0, oword [eax] ; 0041 _ 0F 10. 00
movups xmm1, oword [ebx] ; 0044 _ 0F 10. 0B
mulps xmm0, xmm1 ; 0047 _ 0F 59. C1
movhlps xmm1, xmm0 ; 004A _ 0F 12. C8
addps xmm1, xmm0 ; 004D _ 0F 58. C8
movaps xmm0, xmm1 ; 0050 _ 0F 28. C1
shufps xmm1, xmm1, 1 ; 0053 _ 0F C6. C9, 01
addss xmm0, xmm1 ; 0057 _ F3: 0F 58. C1
movss dword [ecx], xmm0 ; 005B _ F3: 0F 11. 01
leave ; 005F _ C9
ret ; 0060 _ C3
*/
char asmDot_body[] =
{
0xC8, 0x00, 0x00, 0x00,
0x8B, 0x45, 0x08,
0x8B, 0x5D, 0x0C,
0x8B, 0x4D, 0x10,
0x0F, 0x10, 0x00,
0x0F, 0x10, 0x0B,
0x0F, 0x59, 0xC1,
0x0F, 0x12, 0xC8,
0x0F, 0x58, 0xC8,
0x0F, 0x28, 0xC1,
0x0F, 0xC6, 0xC9, 0x01,
0xF3, 0x0F, 0x58, 0xC1,
0xF3, 0x0F, 0x11, 0x01,
0xC9,
0xC3
};
void (*asmAddSSE)(float4*, float4*, float4*) = (void (*)(float4*, float4*, float4*)) asmDot_body;
float4 a = {1,2,1,0};
float4 b = {1,2,3,0};
float4 c = {0,0,0,0};
asmAddSSE(&a,&b,&c);
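For checking what the routine writes into c, a plain C version of the same dot product could serve as a reference. This is only a sketch: the float4 layout (four named floats) and the name dot_reference are assumptions, since the post only says float4 is a struct or array of 4 floats.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout: four consecutive floats. */
typedef struct { float x, y, z, w; } float4;

/* Plain C reference (hypothetical name): stores dot(a, b) in c->x and
   leaves the other fields of c untouched, like the final movss store
   of the machine code routine. */
void dot_reference(const float4 *a, const float4 *b, float4 *c)
{
    assert(a != NULL && b != NULL && c != NULL);
    c->x = a->x * b->x + a->y * b->y + a->z * b->z + a->w * b->w;
}
```

With a = {1,2,1,0} and b = {1,2,3,0} this stores 1*1 + 2*2 + 1*3 + 0*0 = 8 in c.x, which is the value the SSE routine should also leave in the first float of c.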
//EDIT LATER
FOUND IT! It works extremely well (passing arguments, and also FPU and even SSE). I'm happy.
Thanks, Necrolis, for stating that it was working on your system.
I began experimenting with compiler switches to set up alignment and also to disable some of them, and it turned out that -pr (use fastcall) was enabled and I had to turn it off.
(I have two compile.bat files: one for normal compilation and a second that also generates assembly. The second has no -pr switch, so the asm code I posted above is okay, but my normal compile.bat generated fastcall calls and it went boom!)
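One way to make the call independent of switches like -pr might be to spell out the calling convention in the function pointer type itself. The sketch below assumes a compiler that understands __cdecl (Windows compilers generally do); ASM_CALL, asm_fn3, first_arg and call_routine are hypothetical names, and first_arg is just a C stand-in for the machine code routine.

```c
#include <assert.h>

/* ASM_CALL is a hypothetical macro: __cdecl on Windows compilers,
   empty elsewhere so the sketch still builds. */
#if defined(_WIN32) || defined(__WIN32__)
#define ASM_CALL __cdecl
#else
#define ASM_CALL
#endif

/* Pointer type for a routine that takes three ints on the stack. */
typedef int (ASM_CALL *asm_fn3)(int, int, int);

/* Stand-in for the machine code routine: returns its first argument,
   like "mov eax, [ebp+8]" does under cdecl. */
int ASM_CALL first_arg(int a, int b, int c)
{
    (void)b;
    (void)c;
    return a;
}

/* Calls through the explicitly typed pointer, so the caller always
   pushes the arguments the way the routine expects. */
int call_routine(asm_fn3 fn)
{
    assert(fn != 0);
    return fn(77, 66, 55);
}
```

Here call_routine(first_arg) returns 77. For the real routine, the byte array would be cast to asm_fn3 instead of to an unannotated pointer type.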
Your very first problem is that you assume the code is executable. If you are lucky, DEP is off and you can execute code from your stack, but generally (99.99% of the time) you need to allocate executable memory to do this. Secondly, writing out raw machine code the way you are doing is horrible and prone to bugs. If you feel you cannot use the inline assembler provided by your compiler, use something like AsmJIT instead (or any other in-memory assembler).
Your code does work fine, however (when called using __cdecl), once those issues are addressed; it's still unsafe, though. (I ran it and got the expected result of 77 after putting it in executable memory.) You will likely run into problems down the road with fixing up virtual and absolute calls and long jumps, which will make this ever more complex.
Your crashes on FPU and SSE instructions are most likely alignment problems, but it's impossible to tell without the system error code, your assembly, or the CPU you are using. In cases like this, it's best to use a debugger such as OllyDbg (which is free) and step through the code.
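The semi-corrected code:
#include <windows.h>
#include <stdio.h>
#include <string.h>

static unsigned char asmRoutine_body[] =
{
0xC8, 0x00, 0x00, 0x00, // enter 0, 0
0x8B, 0x45, 0x08, // mov eax, [ebp+8]
0xC9, // leave
0xC3 // ret
};

int main(void)
{
// copy the routine into memory that is allowed to execute
void* p = VirtualAlloc(NULL, sizeof(asmRoutine_body), MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
if (p == NULL) return 1;
memcpy(p, asmRoutine_body, sizeof(asmRoutine_body));
int (__cdecl *asmRoutine)(int, int, int) = (int (__cdecl *)(int, int, int))p;
int ret = asmRoutine(77, 66, 55);
VirtualFree(p, 0, MEM_RELEASE); // size must be 0 with MEM_RELEASE
printf("%d\n", ret);
return 0;
}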
outputs: 77
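If you would rather not keep the page writable and executable at the same time, a variant is to allocate it read/write, copy the bytes in, and only then switch it to read/execute. The following is a rough sketch, not a definitive implementation: load_code and free_code are hypothetical helper names, and the mmap/mprotect branch is only an assumed fallback so that it also builds on POSIX systems.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on strict glibc builds */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(_WIN32)
#include <windows.h>
#else
#include <sys/mman.h>
#endif

/* Hypothetical helper: copies `size` bytes of machine code into fresh
   memory, then makes that memory executable and read-only.
   Returns NULL on failure. */
void *load_code(const void *code, size_t size)
{
    assert(code != NULL && size > 0);
#if defined(_WIN32)
    DWORD old_protect;
    void *p = VirtualAlloc(NULL, size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    if (p == NULL) return NULL;
    memcpy(p, code, size);
    if (!VirtualProtect(p, size, PAGE_EXECUTE_READ, &old_protect)) {
        VirtualFree(p, 0, MEM_RELEASE);
        return NULL;
    }
    FlushInstructionCache(GetCurrentProcess(), p, size);
#else
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return NULL;
    memcpy(p, code, size);
    if (mprotect(p, size, PROT_READ | PROT_EXEC) != 0) {
        munmap(p, size);
        return NULL;
    }
#endif
    return p;
}

/* Hypothetical helper: releases memory obtained from load_code. */
void free_code(void *p, size_t size)
{
#if defined(_WIN32)
    (void)size; /* VirtualFree with MEM_RELEASE takes a size of 0 */
    VirtualFree(p, 0, MEM_RELEASE);
#else
    munmap(p, size);
#endif
}
```

Usage would be load_code(asmRoutine_body, sizeof(asmRoutine_body)), a cast of the result to the __cdecl function pointer type, the call, and then free_code with the same size.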