Which of these items can safely be assumed to be defined in any practically-usable platform ABI?
Value of CHAR_BIT
Size, alignment requirements and object representation of:
    void*, size_t, ptrdiff_t
    unsigned char and signed char
    intptr_t and uintptr_t
    float, double and long double
    short and long long
    int and long (but here I expect a "no")
Object representation of a null object pointer
Object representation of a null function pointer
For example, if I have a library (compiled by an unknown, but ABI-conforming compiler) which publishes this function:
void* foo(void *bar, size_t baz, void* (*qux)());
can I assume that I can safely call it from my program, regardless of the compiler I use?
Or, taken the other way round, if I am writing a library, is there a set of types such that if I limit the library's public interface to this set, it will be guaranteed to be usable on all platforms where it builds?
The C standard contains an entire section in the appendix summarizing just that:
J.3 Implementation-defined behavior
A completely random subset:
The number of bits in a byte
Which of signed char and unsigned char has the same range, representation and behavior as plain char
The text encodings for multibyte and wide strings
Signed integer representation
The result of converting a pointer to an integer and vice versa (6.3.2.3). Note that this means any pointer, not just object pointers.
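Several of these properties can be inspected directly on a given implementation. The following is a minimal C11 sketch, not an exhaustive check; the function names are made up for illustration, and the values it reports describe only the compiler and target it was built with, not any other ABI.

```c
#include <assert.h>  /* static_assert (C11) */
#include <limits.h>  /* CHAR_BIT, CHAR_MIN */
#include <stddef.h>  /* size_t, ptrdiff_t, NULL */
#include <stdio.h>   /* printf */
#include <string.h>  /* memcpy */

/* The standard only guarantees a lower bound on the byte width. */
static_assert(CHAR_BIT >= 8, "the C standard requires at least 8 bits per byte");

/* Returns 1 if plain char has the range of signed char here, 0 otherwise. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Returns 1 if a null object pointer is stored as all-zero bytes, 0 otherwise.
   The standard does not require this representation. */
int null_pointer_is_all_zero_bytes(void)
{
    void *p = NULL;
    unsigned char bytes[sizeof p];

    memcpy(bytes, &p, sizeof p);
    for (size_t i = 0; i < sizeof p; ++i) {
        if (bytes[i] != 0) {
            return 0;
        }
    }
    return 1;
}

/* Prints a few implementation-defined quantities for this compiler/target. */
void print_implementation_report(void)
{
    printf("CHAR_BIT              = %d\n", CHAR_BIT);
    printf("plain char is signed  = %d\n", plain_char_is_signed());
    printf("null ptr all zeros    = %d\n", null_pointer_is_all_zero_bytes());
    printf("void*       size/align = %zu/%zu\n", sizeof(void *), _Alignof(void *));
    printf("size_t      size/align = %zu/%zu\n", sizeof(size_t), _Alignof(size_t));
    printf("ptrdiff_t   size/align = %zu/%zu\n", sizeof(ptrdiff_t), _Alignof(ptrdiff_t));
    printf("long        size/align = %zu/%zu\n", sizeof(long), _Alignof(long));
    printf("long double size/align = %zu/%zu\n", sizeof(long double), _Alignof(long double));
}
```

Calling print_implementation_report() from main in builds made with two different compilers for the same target is a quick way to spot where they disagree.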
Update: To address your question about ABIs: An ABI (application binary interface) is not a standardized concept, and nothing says that an implementation must even specify one. The ingredients of an ABI are partly the implementation-defined behaviour of the language (though not all of it; e.g. the result of converting an out-of-range value to a signed integer type is implementation-defined, but not part of an ABI), and most of the implementation-defined aspects of the language are dictated by the hardware (e.g. signed integer representation, floating-point representation, size of pointers).
However, more important aspects of an ABI are things like how function calls work, i.e. where the arguments are stored, who is responsible for cleaning up the stack, etc. It is crucial for two compilers to agree on those conventions in order for their code to be binary-compatible.
In practice, an ABI is usually the result of an implementation. Once the compiler is complete, it determines, by virtue of its implementation, an ABI. It may document this ABI, and other compilers, as well as future versions of the same compiler, may choose to stick to those conventions. For C implementations on x86, this has worked rather well, and there are only a few, usually well-documented, free parameters that need to be communicated for code to be interoperable. But for other languages, most notably C++, the picture is completely different: there is nothing coming near a standard ABI for C++ at all. Microsoft's compiler has historically broken the C++ ABI with every major release. GCC tries hard to maintain ABI compatibility across versions and uses the published Itanium C++ ABI (ironically, designed for a now-dead architecture). Other compilers may do their own, completely different thing. (And then, of course, you have issues with C++ standard library implementations, e.g. does your std::string contain one, two, or three pointers, and in which order?)
To summarize: many aspects of a compiler's ABI, especially pertaining to C, are dictated by the hardware architecture. Different C compilers for the same hardware ought to produce compatible binary code as long as certain aspects like function calling conventions are communicated properly. However, for higher-level languages all bets are off, and whether two different compilers can produce interoperable code has to be decided on a case-by-case basis.
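For the library-author side of the question, one common defensive pattern is to restrict the public interface to fixed-width types and to state the remaining ABI assumptions as compile-time checks, so that a build fails loudly where they do not hold. The sketch below assumes C11; the `mylib_` names are hypothetical and the particular assumptions are examples only. It does not check calling conventions, which still have to be agreed on separately.

```c
#include <assert.h>  /* static_assert (C11) */
#include <limits.h>  /* CHAR_BIT */
#include <stddef.h>  /* offsetof */
#include <stdint.h>  /* uint32_t, uint64_t (exact-width types are optional) */

/* Assumptions this hypothetical library makes about the platform. */
static_assert(CHAR_BIT == 8, "mylib assumes 8-bit bytes");

/* Hypothetical public struct: fixed-width members, largest first,
   so that no padding is expected between or after them. */
typedef struct mylib_record {
    uint64_t id;
    uint32_t length;
    uint32_t flags;
} mylib_record;

/* Fail the build if the compiler lays the struct out differently. */
static_assert(sizeof(mylib_record) == 16, "unexpected padding in mylib_record");
static_assert(offsetof(mylib_record, id) == 0, "unexpected offset of id");
static_assert(offsetof(mylib_record, length) == 8, "unexpected offset of length");
static_assert(offsetof(mylib_record, flags) == 12, "unexpected offset of flags");

/* Example public function using only fixed-width types and a pointer:
   folds the 64-bit id into 32 bits and XORs in the other two fields. */
uint32_t mylib_checksum(const mylib_record *rec)
{
    uint32_t folded_id = (uint32_t)(rec->id ^ (rec->id >> 32));
    return folded_id ^ rec->length ^ rec->flags;
}
```

Such checks only confirm that one compiler matches the documented layout; they cannot prove that two compilers agree on everything else.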