clang: MMX intrinsics break long double


I have the following piece of code that fails on assert(!isnan(x)) when compiled with clang. If I compile with -DWITH_MMX=0, it runs fine. I observe the same behaviour on Compiler Explorer and locally on macOS.

I don't understand why the assignment of 42.0 to a long double variable produces NaN when the program uses MMX intrinsics.

I've tried to use Compiler Explorer to figure this out, but I don't get it. Could someone help me understand what's happening here?

// cc -O0 -DWITH_MMX=1 -o nan nan.c
#include <mmintrin.h>
#include <stdio.h>
#include <assert.h>
#include <math.h>
int main()
{
#if WITH_MMX
    __m64 a = _m_from_int(4);
    __m64 b = _m_from_int(8);
    __m64 ab = _m_paddb(a, b);
    int c = _m_to_int(ab);
    assert(c == 12);
#endif
    long double x = 42.0L;
    assert(!isnan(x)); // 42.0 should not be NaN
    printf("done\n");
}

Solution

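  • The legacy (x87) floating-point instructions and the MMX instructions use the same registers, and they cannot be used for both at the same time. Using any MMX instruction other than emms marks all of the floating-point registers as in use (called “valid,” meaning that, in ordinary floating-point use, the register contains some floating-point value). This interferes with the x87 instructions the compiler generates for long double: a load onto a register stack that appears full is treated as a stack overflow, and with that exception masked (the default) the result is a NaN instead of the value loaded. On x86-64, float and double arithmetic normally uses SSE registers, which is why the problem shows up with long double in particular.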

    In assembly code, one would switch from using the registers for MMX to using the registers for legacy floating-point by executing an emms instruction, for which there is the “intrinsic” _mm_empty().

    However, the compiler might not expect the floating-point registers to be empty; your MMX code does not necessarily occur at a point where the compiler has emptied the floating-point registers, so executing emms at the end of your MMX code might not reproduce the state the compiler expects. To do that, you may need to save and restore the FPU state with the fxsave and fxrstor instructions. I would not even guarantee this to work without investigating the compiler further to be sure it will group the save instruction, the MMX instructions, and the restore instruction together.
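
    The emms approach described above can be sketched roughly as follows. This is a minimal sketch, not a guaranteed fix: it assumes an x86 target and that the compiler keeps the x87 code after the _mm_empty() call, which (per the caveat above) is not something I would promise under optimization. The helper names mmx_add_bytes and long_double_ok_after_mmx are made up for illustration; the intrinsics are the same ones used in the question, plus _mm_empty().

```c
#include <assert.h>
#include <math.h>
#include <mmintrin.h>

/* Hypothetical helper: does the MMX work from the question, then executes
   emms (via _mm_empty) so the x87 registers are marked empty again
   before any legacy floating-point code runs. */
static int mmx_add_bytes(int a, int b)
{
    __m64 va = _m_from_int(a);
    __m64 vb = _m_from_int(b);
    __m64 sum = _m_paddb(va, vb);   /* packed byte-wise add */
    int result = _m_to_int(sum);    /* low 32 bits of the result */
    _mm_empty();                    /* emms: leave MMX state */
    return result;
}

/* Hypothetical helper: runs the MMX code, then touches a long double.
   Returns 1 if the sum is right and the long double is not NaN. */
static int long_double_ok_after_mmx(void)
{
    int c = mmx_add_bytes(4, 8);
    volatile long double x = 42.0L; /* volatile: force a real load */
    return c == 12 && !isnan(x);
}
```

    Calling long_double_ok_after_mmx() from main should return 1 in place of the failing assertion from the question. Keeping all MMX use inside one small function that ends with _mm_empty() is a design choice meant to make it harder to forget the emms; it does not address the save/restore concern raised above.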