The availability of some platform-specific features, such as SSE or AVX, can be determined at runtime, which is very useful if you do not want to compile and ship different objects for the different features.
The following code, for example, allows me to check for AVX; it compiles with gcc, which provides the cpuid.h header:
#include <stdbool.h>
#include <cpuid.h>

bool has_avx(void)
{
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid() returns 0 if CPUID leaf 1 is not supported */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return false;
    return (ecx & bit_AVX) != 0;
}
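To make the cost concrete, here is a rough sketch of how a call site would use such a check with the result cached. The names sum, sum_basic and sum_avx are hypothetical, and the "AVX" variant is only a stand-in that calls the basic one. CPUID runs once, but every call still goes through a branch:

```c
#include <stdbool.h>
#include <stddef.h>
#include <cpuid.h>

/* Same check as above: CPUID leaf 1, AVX bit in ECX. */
static bool has_avx(void)
{
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return false;
    return (ecx & bit_AVX) != 0;
}

/* Hypothetical basic implementation: sums an array. */
static long sum_basic(const int *v, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Stand-in for an AVX version; a real one would use intrinsics. */
static long sum_avx(const int *v, size_t n)
{
    return sum_basic(v, n);
}

long sum(const int *v, size_t n)
{
    /* -1 = not checked yet, 0 = no AVX, 1 = AVX.
     * Not synchronized; concurrent first calls would store the same value. */
    static int cached = -1;

    if (cached < 0)
        cached = has_avx() ? 1 : 0;

    /* This branch is taken on every call, even after caching. */
    return cached ? sum_avx(v, n) : sum_basic(v, n);
}
```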
Runtime checks such as the above, littered throughout the code, are slow: they repeat the check and introduce branching (the result could be cached to reduce the overhead, but the branch would remain). Instead, I figured that I could use the infrastructure provided by the dynamic linker/loader.
On ELF platforms, calls to functions with external linkage are already indirect and go through the Procedure Linkage Table (PLT) and Global Offset Table (GOT).
Suppose there are two internal functions: a basic _do_something_basic that always works and a somehow optimized version, _do_something_avx, which uses AVX. I could export a generic do_something symbol and alias it to the basic one:
static void _do_something_basic(…) {
// Basic implementation
}
static void _do_something_avx(…) {
// Optimized implementation using AVX
}
void do_something(…) __attribute__((alias("_do_something_basic")));
At load time of my library or program, I would like to check the availability of AVX once using has_avx and, depending on the result, point the do_something symbol to _do_something_avx.
Even better would be if I could point the initial version of the do_something symbol to a self-modifying function that checks the availability of AVX using has_avx and replaces itself with _do_something_basic or _do_something_avx.
In theory this should be possible, but how can I find the location of the PLT/GOT programmatically? Is there an ABI/API provided by the ELF loader, e.g. ld-linux.so.2, that I could use for this? Do I need a linker script to obtain the PLT/GOT location? What about security considerations: can I even write to the PLT/GOT if I obtain a pointer to it?
Maybe some project has done this or something very similar already.
I'm fully aware that the solution would be highly platform-specific, but since I already have to deal with low-level, platform-specific details such as features of the instruction set, this is fine.
As others have suggested, you can go with platform-specific versions of the libraries. Or, if you are OK with sticking to Linux, you can use the (relatively) new IFUNC relocations, which do exactly what you want.
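Below is a minimal sketch of what this could look like, assuming GCC on x86 Linux with glibc. The function signature, the array-sum bodies and the resolver name resolve_do_something are made up for illustration, since the question elides them. The resolver is run by the dynamic loader when the symbol is relocated, and all later calls go directly to whichever implementation it returned:

```c
#include <stddef.h>

/* Hypothetical basic implementation: sums an array. */
static long _do_something_basic(const int *v, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Hypothetical AVX variant: same code, but the compiler is allowed
 * to emit AVX instructions for this one function. */
__attribute__((target("avx")))
static long _do_something_avx(const int *v, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Resolver: returns a pointer to the implementation to bind to.
 * It may run before constructors, so __builtin_cpu_init() is called
 * explicitly before __builtin_cpu_supports(). */
static long (*resolve_do_something(void))(const int *, size_t)
{
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx") ? _do_something_avx
                                         : _do_something_basic;
}

/* Exported symbol; the ifunc attribute ties it to the resolver. */
long do_something(const int *v, size_t n)
    __attribute__((ifunc("resolve_do_something")));
```

Callers simply call do_something(); there is no check or branch at the call site. Keep the resolver as simple as possible, since it can run before the rest of the program is fully relocated.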
EDIT: As noted by Sebastian, IFUNCs also seem to be supported on other platforms (FreeBSD, Android). Note, however, that the feature is not that widely used, so it may have some rough edges.