My platform is x86_64 + Windows 10 + Cygwin. My compiler is x86_64-w64-mingw32-gcc.
For some reason, I had to compile my program with the -mabi=sysv option, and I would like to avoid the default -mabi=ms option if at all possible.
The program compiled successfully, but when it calls library functions like printf, it segfaults. The reason is that the library functions reside in msvcrt.dll, which was probably prebuilt with a calling convention other than -mabi=sysv.
So, is there a way to install libraries compiled with -mabi=sysv in Cygwin?
You generally want to avoid this, e.g. by using macros in your asm to adapt it between calling conventions. Agner Fog's calling convention guide has some suggestions, e.g. using #ifdef or %if to conditionally include some extra mov instructions ahead of the function that put args in the registers the rest of the function was written for. (And then don't use the red zone or shadow space, i.e. stick to the lowest common denominator, if your function can still work efficiently that way.)
IDK if the same .cfi directives could create correct stack-unwind metadata for both MS and SysV versions, but if you just use your SysV assembly you're unlikely to have SEH-safe code that fully complies with the Windows ABI.
If you declare every hand-written asm function's prototype with __attribute__((sysv_abi)), GCC will make code that calls it correctly.
The GCC manual also says:
Note, the ms_abi attribute for Microsoft Windows 64-bit targets currently requires the -maccumulate-outgoing-args option.
Of course, any callback function-pointers you pass into the asm functions also need to be __attribute__((sysv_abi)), because the asm will call them with the SysV convention. So if you want to pass the address of any Windows library function, you'll need to write your own wrapper for it which uses that attribute in its definition.
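A minimal sketch of such a wrapper, assuming the asm routine calls its callback with the SysV convention. All names here (lib_negate, lib_negate_sysv, sum_mapped, sum_negated) are hypothetical stand-ins: lib_negate plays the role of a prebuilt MS-ABI library function, and sum_mapped is written in C only so the example is self-contained, where the real thing would be the hand-written asm.

```c
#include <assert.h>
#include <stddef.h>

#define SYSV  __attribute__((sysv_abi))
#define MSABI __attribute__((ms_abi))

/* Pointer type for callbacks that get called with the SysV convention. */
typedef int (SYSV *sysv_callback)(int);

/* Hypothetical stand-in for a prebuilt Windows library function.
   Marked ms_abi explicitly so the sketch means the same thing on Linux GCC. */
MSABI int lib_negate(int x) { return -x; }

/* Wrapper with a SysV-ABI definition: GCC emits the SysV -> MS transition
   (argument shuffling, register saving) inside this function. */
SYSV int lib_negate_sysv(int x) { return lib_negate(x); }

/* Hypothetical stand-in for a hand-written SysV asm routine that
   calls the callback on every element and sums the results. */
SYSV int sum_mapped(const int *v, size_t n, sysv_callback cb)
{
    assert(v != NULL && cb != NULL);
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += cb(v[i]);
    return sum;
}

/* Default-ABI caller: passes the wrapper, never the raw library function. */
int sum_negated(const int *v, size_t n)
{
    return sum_mapped(v, n, lib_negate_sysv);
}
```

Passing lib_negate itself to sum_mapped would be a pointer-type mismatch, and if the compiler let it through, the callee would look for its argument in the wrong register.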
This comes at some performance overhead, because the SysV ABI allows clobbering all XMM regs, but MS-ABI doesn't, so every function that calls any SysV-ABI function needs to save/restore xmm6-15. This also bloats those call-sites, which can be mitigated by having GCC use a helper function to save / restore those mismatched registers:
-mcall-ms2sysv-xlogues
Due to differences in 64-bit ABIs, any Microsoft ABI function that calls a System V ABI function must consider RSI, RDI and XMM6-15 as clobbered. By default, the code for saving and restoring these registers is emitted inline, resulting in fairly lengthy prologues and epilogues. Using -mcall-ms2sysv-xlogues emits prologues and epilogues that use stubs in the static portion of libgcc to perform these saves and restores, thus reducing function size at the cost of a few extra instructions.
#define SYSV __attribute__((sysv_abi))
SYSV int foo(int x, double);

// no attribute so this function is the default MS-ABI
int bar(int *p, double d) {
    *p = 0;                    // incoming arg in RCX
    int tmp = foo(p[12], d);
    *p = tmp;
    return tmp;
}
Built with -O3 -mabi=ms -Wall -maccumulate-outgoing-args using Linux GCC 11.2. -mabi=ms is a rough simulation of building with Cygwin or MinGW.
bar:
push rdi
movapd xmm0, xmm1 # SysV wants first FP arg in XMM0
push rsi # save RDI/RSI, call-preserved in MS-ABI
push rbx
mov rbx, rcx
sub rsp, 160 # 10 x 16 bytes of spill space for xmm6-15 (a SysV callee needs no shadow space)
mov DWORD PTR [rcx], 0 # deref pointer arg
mov edi, DWORD PTR [rcx+48] # load a new 1st arg
movaps XMMWORD PTR [rsp], xmm6 # save the MS-abi call-preserved regs
movaps XMMWORD PTR [rsp+16], xmm7
movaps XMMWORD PTR [rsp+32], xmm8
movaps XMMWORD PTR [rsp+48], xmm9
movaps XMMWORD PTR [rsp+64], xmm10
movaps XMMWORD PTR [rsp+80], xmm11
movaps XMMWORD PTR [rsp+96], xmm12
movaps XMMWORD PTR [rsp+112], xmm13
movaps XMMWORD PTR [rsp+128], xmm14
movaps XMMWORD PTR [rsp+144], xmm15
call foo
movaps xmm6, XMMWORD PTR [rsp]
movaps xmm7, XMMWORD PTR [rsp+16]
movaps xmm8, XMMWORD PTR [rsp+32]
movaps xmm9, XMMWORD PTR [rsp+48]
mov DWORD PTR [rbx], eax
movaps xmm10, XMMWORD PTR [rsp+64]
movaps xmm11, XMMWORD PTR [rsp+80]
movaps xmm12, XMMWORD PTR [rsp+96]
movaps xmm13, XMMWORD PTR [rsp+112]
movaps xmm14, XMMWORD PTR [rsp+128]
movaps xmm15, XMMWORD PTR [rsp+144]
add rsp, 160
pop rbx
pop rsi
pop rdi
ret
Note that all this XMM save/restore would still happen even if there were no double or float args; the caller has to assume the callee might be using SIMD or FP internally. Fortunately only the low xmm part of the high vector regs is call-preserved, even with AVX enabled, so the stack space usage doesn't get worse.
If I'd used a __m128d arg, x64 fastcall (the default with ms_abi) would pass it by reference, pointed to by RDX. vectorcall would have passed it in XMM1 like a double (not xmm0 because arg-register numbering isn't independent between int and FP in MS-ABI). I don't know how to use x64 vectorcall with GCC. __attribute__((vectorcall)) is ignored if used on bar.
vs. with both parts compiled as MS-ABI (commenting out the SYSV on the prototype):
bar:
push rbx # save a call-preserved reg (and realign the stack)
mov rbx, rcx # save incoming arg around function call
sub rsp, 32 # reserve shadow space
mov DWORD PTR [rcx], 0 # deref the arg
mov ecx, DWORD PTR [rcx+48] # load a new 1st arg
# 2nd arg still in XMM1
call foo
mov DWORD PTR [rbx], eax # save return value
add rsp, 32
pop rbx
ret # and return with it in EAX
Clang doesn't accept -mabi=ms, but respects __attribute__((ms_abi)). So a Windows build of clang that defaults to MS-ABI can probably use __attribute__((sysv_abi)).
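One way to keep the prototypes portable across GCC and clang is to pick the attribute with a feature test. This is only a sketch under assumptions: the SYSV macro mirrors the one used earlier, scale_add and use_scale_add are hypothetical names, and scale_add is written in C here only as a stand-in for a hand-written SysV asm routine. I haven't tried this on a Windows build of clang.

```c
#include <assert.h>

/* Use the attribute whenever the compiler reports support for it. */
#if defined(__has_attribute)
#  if __has_attribute(sysv_abi)
#    define SYSV __attribute__((sysv_abi))
#  endif
#endif

#ifndef SYSV
#  ifdef _WIN32
     /* Without the attribute, calls into SysV asm would use the wrong registers. */
#    error "this compiler cannot call SysV-ABI functions on Windows"
#  else
     /* Assumption: non-Windows x86-64 targets already default to the SysV ABI. */
#    define SYSV
#  endif
#endif

/* Hypothetical stand-in for a hand-written SysV asm routine. */
SYSV long scale_add(long a, long b, long k) { return a + b * k; }

/* Default-ABI caller; the compiler handles the ABI transition at the call. */
long use_scale_add(long x)
{
    long r = scale_add(x, 2, 10);
    assert(r == x + 20);
    return r;
}
```

Failing the build on Windows when the attribute is missing seems safer than silently falling back to the default ABI, since the mismatch would otherwise only show up as a crash at run time.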