I am calling a NASM 64-bit DLL from ctypes. The DLL takes five input parameters. In the Windows calling convention, the first four are passed in registers (rcx, rdx, r8 and r9 for integers and pointers), and the fifth is passed on the stack.
The x64 calling convention documentation says:
"Any parameters beyond the first four must be stored on the stack, above the shadow store for the first four, prior to the call."
Therefore the value can't be accessed with a pop, and I think it should be accessed relative to RSP. If the fifth (and later) parameters are above the shadow store, then I guessed it would be at RSP minus 40 (mov rax,[rsp-40]), but it's not.
I tried "walking the stack", meaning I tried rsp-0, rsp-8, rsp-16, etc., all the way to rsp-56, but none of them returned the value that I had passed as the fifth parameter (a single 64-bit double).
According to https://learn.microsoft.com/en-us/cpp/build/stack-allocation, the stack layout on entry is return address, rcx, rdx, r8, r9, and the stack parameter area, so I would expect to find my value at rsp-48, but it's not there, nor is it at rsp-56.
So my question is: how do I access a parameter passed on the stack on entry to the DLL in the Windows calling convention?
EDIT: Here is the relevant ctypes code:
hDLL = ctypes.WinDLL("C:/Test_Projects/MultiCompare/py_descent.dll")
CallName = hDLL.Main_Entry_fn
CallName.argtypes = [ctypes.POINTER(ctypes.c_double), ctypes.POINTER(ctypes.c_double), ctypes.c_double, ctypes.POINTER(ctypes.c_double), ctypes.c_double]
CallName.restype = ctypes.c_double
ret_ptr = CallName(CA_x,CA_d,CA_mu,length_array_out,CA_N_epochs)
Data types:
CA_x: pointer to double(float) array
CA_d: pointer to double(float) array
CA_mu: double
length_array_out: pointer to double(float) array
CA_N_epochs: double
Here is the DLL entry point where the variables are retrieved. I always push rdi and rbp on entry, so I read the parameters passed on the stack first, before those pushes move RSP, to prevent stack misalignment:
Main_Entry_fn:
; First the stack parameters
movsd xmm0,[rsp+40]
movsd [N_epochs],xmm0
; End stack section
push rdi
push rbp
mov [x_ptr],rcx
mov [d_ptr],rdx
movsd [mu],xmm2
mov [data_master_ptr],r9
; Now assign lengths
; (this part intentionally omitted for brevity)
call py_descent_fn
exit_label_for_Main_Entry_fn:
pop rbp
pop rdi
ret
The links provided were relatively clear, but if things are ambiguous I resort to compiling a C example and looking at the assembly. The result is at the end of this post. The assignments were:
[RSP] Return address
[RSP+8] ECX int a (XMM0 unused)
[RSP+16] EDX int b (XMM1 unused)
[RSP+24] XMM2 double c (R8 unused)
[RSP+32] R9D int d (XMM3 unused)
[RSP+40] double e
[RSP+48] int f
The first four parameters are in registers. The first parameter is RCX/XMM0 depending on the type. Second is RDX/XMM1, third is R8/XMM2, and fourth is R9/XMM3. The fifth and later parameters ([RSP+40] and [RSP+48] in this case) are always on the stack. The four quadwords at [RSP+8] through [RSP+32] are the shadow space for the registers. All of these offsets are relative to RSP on entry to the function, before anything is pushed. I compiled the example below with no optimization, so the function immediately spills the registers to the shadow space.
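To make the offset arithmetic concrete, here is a minimal Python sketch of the rule described above. It is only an illustration: the helper name win64_arg_locations and its pushed_bytes parameter are made up for this post, and it assumes every argument is an integer/pointer or a float/double of at most 8 bytes (no structs passed by value, no varargs).

```python
# Sketch: where Microsoft x64 arguments live, relative to RSP.
# Assumption: each argument is an 8-byte-or-smaller integer/pointer ("int")
# or float/double ("float"); larger aggregates and varargs are not modelled.

INT_REGS = ["rcx", "rdx", "r8", "r9"]
FLOAT_REGS = ["xmm0", "xmm1", "xmm2", "xmm3"]


def win64_arg_locations(kinds, pushed_bytes=0):
    """Return one location string per argument.

    kinds        -- sequence of "int" or "float", in declaration order
    pushed_bytes -- hypothetical knob: bytes the callee has pushed since
                    entry (0 means "at the first instruction")
    """
    locations = []
    for index, kind in enumerate(kinds):
        if kind not in ("int", "float"):
            raise ValueError(f"unknown argument kind: {kind!r}")
        # [rsp] holds the return address on entry, so slot 0 starts at +8.
        slot = 8 + 8 * index + pushed_bytes
        if index < 4:
            # The register is chosen by position, then by type.
            reg = (FLOAT_REGS if kind == "float" else INT_REGS)[index]
            locations.append(f"{reg} (shadow slot [rsp+{slot}])")
        else:
            # Fifth and later arguments exist only on the stack.
            locations.append(f"[rsp+{slot}]")
    return locations


# Signature from the question: (double*, double*, double, double*, double)
question_kinds = ["int", "int", "float", "int", "float"]
for name, where in zip(["x", "d", "mu", "length_array_out", "N_epochs"],
                       win64_arg_locations(question_kinds)):
    print(name, "->", where)
```

For the signature in the question this places the fifth parameter at [rsp+40] on entry. Calling win64_arg_locations(question_kinds, pushed_bytes=16) models reading it after the two pushes (rdi and rbp), where the same value sits at [rsp+56].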
Hope this clears it up.
C example
int func(int a, int b, double c, int d, double e, int f)
{
    return (int)(a+b+c+d+e+f);
}
int main()
{
    func(1,2,1.1,4,5.5,6);
    return 0;
}
Assembly generated:
; Listing generated by Microsoft (R) Optimizing Compiler Version 19.00.24215.1
include listing.inc
INCLUDELIB LIBCMT
INCLUDELIB OLDNAMES
PUBLIC func
PUBLIC main
PUBLIC __real@3ff199999999999a
PUBLIC __real@4016000000000000
EXTRN _fltused:DWORD
pdata SEGMENT
$pdata$main DD imagerel $LN3
DD imagerel $LN3+62
DD imagerel $unwind$main
pdata ENDS
; COMDAT __real@4016000000000000
CONST SEGMENT
__real@4016000000000000 DQ 04016000000000000r ; 5.5
CONST ENDS
; COMDAT __real@3ff199999999999a
CONST SEGMENT
__real@3ff199999999999a DQ 03ff199999999999ar ; 1.1
CONST ENDS
xdata SEGMENT
$unwind$main DD 010401H
DD 06204H
xdata ENDS
; Function compile flags: /Odtp
; File c:\users\metolone\x.c
_TEXT SEGMENT
main PROC
; 7 : {
$LN3:
00000 48 83 ec 38 sub rsp, 56 ; 00000038H
; 8 : func(1,2,1.1,4,5.5,6);
00004 c7 44 24 28 06
00 00 00 mov DWORD PTR [rsp+40], 6
0000c f2 0f 10 05 00
00 00 00 movsd xmm0, QWORD PTR __real@4016000000000000
00014 f2 0f 11 44 24
20 movsd QWORD PTR [rsp+32], xmm0
0001a 41 b9 04 00 00
00 mov r9d, 4
00020 f2 0f 10 15 00
00 00 00 movsd xmm2, QWORD PTR __real@3ff199999999999a
00028 ba 02 00 00 00 mov edx, 2
0002d b9 01 00 00 00 mov ecx, 1
00032 e8 00 00 00 00 call func
; 9 : return 0;
00037 33 c0 xor eax, eax
; 10 : }
00039 48 83 c4 38 add rsp, 56 ; 00000038H
0003d c3 ret 0
main ENDP
_TEXT ENDS
; Function compile flags: /Odtp
; File c:\users\metolone\x.c
_TEXT SEGMENT
a$ = 8
b$ = 16
c$ = 24
d$ = 32
e$ = 40
f$ = 48
func PROC
; 2 : {
00000 44 89 4c 24 20 mov DWORD PTR [rsp+32], r9d
00005 f2 0f 11 54 24
18 movsd QWORD PTR [rsp+24], xmm2
0000b 89 54 24 10 mov DWORD PTR [rsp+16], edx
0000f 89 4c 24 08 mov DWORD PTR [rsp+8], ecx
; 3 : return (int)(a+b+c+d+e+f);
00013 8b 44 24 10 mov eax, DWORD PTR b$[rsp]
00017 8b 4c 24 08 mov ecx, DWORD PTR a$[rsp]
0001b 03 c8 add ecx, eax
0001d 8b c1 mov eax, ecx
0001f f2 0f 2a c0 cvtsi2sd xmm0, eax
00023 f2 0f 58 44 24
18 addsd xmm0, QWORD PTR c$[rsp]
00029 f2 0f 2a 4c 24
20 cvtsi2sd xmm1, DWORD PTR d$[rsp]
0002f f2 0f 58 c1 addsd xmm0, xmm1
00033 f2 0f 58 44 24
28 addsd xmm0, QWORD PTR e$[rsp]
00039 f2 0f 2a 4c 24
30 cvtsi2sd xmm1, DWORD PTR f$[rsp]
0003f f2 0f 58 c1 addsd xmm0, xmm1
00043 f2 0f 2c c0 cvttsd2si eax, xmm0
; 4 : }
00047 c3 ret 0
func ENDP
_TEXT ENDS
END