
Why shouldn't I catch Undefined Instruction exception instead of using CPUID?


Assume I want to use an instruction that may not be available, and that has no transparent fallback: on a CPU that lacks it, it's an undefined instruction. Say it is popcnt, for example.

Can I, instead of using cpuid, just try to execute it?

If it fails, I'll catch the exception, save that information in a bool variable, and take a different branch from then on.

Sure, there would be a performance penalty, but only once. Are there any additional disadvantages to this approach?


Solution

  • One major difficulty is giving correct execution for that first call.

    Once you solve that (by figuring out which instruction faulted, emulating it, and modifying the saved task state), the next problem is performance: picture a loop containing popcnt that runs a million iterations after you optimistically dispatched to the popcnt version of that loop.

    If your whole program were written in asm (or compilers could generate this code for you), it's maybe plausible, but still hard, for a signal handler to collect all the necessary state and resume execution in the other version of such a loop.

    (GNU/Linux signal handlers installed with SA_SIGINFO get a non-standard extra argument with a pointer to the saved register state of the thread they're running in, so you could in theory do this there.)

    Presumably this is only relevant for ahead-of-time compilation; if you're JITing you should just check CPUID ahead of time instead of building exception-handling paths.


    Being able to dispatch efficiently means your code is probably already written with function pointers for functions that are multiversioned.

    So the only saving here is one simple init function that your program runs once, which runs CPUID a couple of times and sets all the function pointers. Doing it lazily later, as needed, means more cache misses, unless a lot of the function pointers go unused, e.g. when a large program is run just for --help.

    The code for these exception / signal handlers probably wouldn't be smaller than a simple init function. Interesting idea, but overall I don't see any meaningful benefit.
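Such an init function might look like the following sketch (the names `popcount_fn` and `init_dispatch` are illustrative, not from the answer; `__builtin_cpu_supports` and the `target` attribute are GCC/Clang-specific):

```c
// One-time init that checks CPUID once and fills in a function pointer,
// so later calls dispatch without re-checking the feature bit.
#include <stdint.h>

// Portable fallback: clear the lowest set bit until none remain.
static int popcount_generic(uint64_t x) {
    int n = 0;
    while (x) { x &= x - 1; n++; }
    return n;
}

// With target("popcnt"), the builtin compiles to the popcnt instruction;
// only call this after confirming the CPU supports it.
static __attribute__((target("popcnt")))
int popcount_hw(uint64_t x) {
    return (int)__builtin_popcountll(x);
}

// The pointer every caller goes through after init.
static int (*popcount_fn)(uint64_t) = popcount_generic;

static void init_dispatch(void) {
    if (__builtin_cpu_supports("popcnt"))
        popcount_fn = popcount_hw;
}
```

Callers then use `popcount_fn(x)` everywhere; the indirect call is the same cost whether it points at the hardware or generic version.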


    You also need to know which instruction faulted, if your program uses multiple CPU features.

    If you're emulating or something, you'd need to check whether it's one of the instructions you expect might raise #UD exceptions / SIGILL signals, e.g. by examining the machine code at the fault address.

    But if you were instead having functions keep track of which optimistic dispatch they had just done (so they could detect when it didn't work), you'd need to set a variable before every dispatch, so that's actually extra overhead.
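To make concrete how much machinery even the "check the machine code at the fault address" step needs, here is a Linux/x86-64, glibc-specific sketch of a SIGILL handler that recognizes one popcnt encoding and skips it. It is deliberately incomplete: a real handler would have to match every encoding the program might use, and skipping (rather than emulating) leaves the destination register holding garbage, which is exactly the correct-execution problem described above.

```c
// Sketch only: recognize popcnt r64,r64 (F3 REX.W 0F B8 /r) at the fault
// address, record that it's unsupported, and skip the 5-byte instruction.
#define _GNU_SOURCE
#include <signal.h>
#include <stdint.h>
#include <ucontext.h>

static volatile sig_atomic_t have_popcnt = 1;

static void on_sigill(int sig, siginfo_t *si, void *ctx) {
    (void)sig; (void)si;
    ucontext_t *uc = ctx;
    const uint8_t *ip = (const uint8_t *)uc->uc_mcontext.gregs[REG_RIP];
    // Simplified match: only the 0x48 REX prefix form is checked here.
    if (ip[0] == 0xF3 && ip[1] == 0x48 && ip[2] == 0x0F && ip[3] == 0xB8) {
        have_popcnt = 0;                      // remember for later dispatch
        uc->uc_mcontext.gregs[REG_RIP] += 5;  // skip, but dest reg is stale!
    }
    // Any other #UD falls through and re-faults: a real handler must
    // chain to the previous disposition or abort.
}

static void install_handler(void) {
    struct sigaction sa = {0};
    sa.sa_sigaction = on_sigill;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGILL, &sa, 0);
}
```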