I am writing a C interface for CPU's cpuid
instruction. I'm just doing this as kind of an exercise: I don't want to use compiler-depended headers such as cpuid.h
for GCC or intrin.h
for MSVC. Also, I'm aware that using C inline assembly would be a better choice, since it avoids thinking about calling conventions (see this implementation): I'd just have to think about different compiler's syntaxes. However I'd like to start practicing a bit with integrating assembly and C.
Given that I now have to write a different assembly implementation for each major assembler (I was thinking of GAS, MASM and NASM) and for each of them both for x86-64 and x86, how should I handle the fact that different machines and C compilers may use different calling conventions?
If you really want to write, as just an exercise, an assembly function that "conforms" to all the common calling conventions for x86_64 (I know only the Windows one and the System V one), without relying on attributes or compiler flags to force the calling convention, let's take a look at what's common.
The Windows GPR passing order is rcx
, rdx
, r8
, r9
. The System V passing order is rdi
, rsi
, rdx
, rcx
, r8
, r9
. In both cases, rax
holds the return value if it fits and is a piece of POD. Technically speaking, you can get away with a "polyglot" called function if it (0) saves the union of what each ABI considers non-volatile, and (1) returns something that can fit in a single register, and (2) takes no more than 2 GPR arguments, because overlap would happen past that. To be absolutely generic, you could make it take a single pointer to some structure that would hold whatever arbitrary return data you want.
So now our arguments will come through either rcx
and rdx
or rdi
and rsi
. How do you tell which will contain the arguments? I'm actually not sure of a good way. Maybe what you could do instead is have a wrapper that puts the arguments in the right spot, and have your actual function take "padding" arguments, so that your arguments always land in rcx
and rdx
. You could technically expand to r8
and r9
this way.
#ifdef _WIN32
#define CPUID(information) cpuid(information, NULL, NULL, NULL)
#else
#define CPUID(information) cpuid(NULL, NULL, NULL, information)
#endif
// d duplicates a
// c duplicates b
no_more_than_64_bits_t cpuid(void * a, void * b, void * c, void * d);
Then, in your assembly, save the union of what each ABI considers non-volatile, do your thing, put whatever information you want in the structure to which rcx
points, and restore.