gcc, intrinsics, instruction-set, compiler-flags, avx512

Illegal Instruction with _mm_cmpeq_epi8_mask


I'm trying to run code similar to the following:

#include <immintrin.h>
#include <iostream> // needed for std::cout in foo
void foo() {
    __m128i a = _mm_set_epi8 (0,0,6,5,4,3,2,1,8,7,6,5,4,3,2,1);
    __m128i b = _mm_set_epi8 (0,0,0,0,0,0,0,1,8,7,6,5,4,3,2,1);
    __mmask16 m = _mm_cmpeq_epi8_mask(a,b); // supposedly requires avx512vl and avx512bw
    std::cout<<m<<std::endl;
}
void bar() {
    int dataa[8] = {1,0,1,0,1,0,1,0};
    __m256i points = _mm256_lddqu_si256((__m256i *)&dataa[0]); // requires just mavx
    (void)points;
}

However, I keep running into the error: Illegal instruction (core dumped)

I compile the code with

g++ -std=c++11 -march=broadwell -mavx -mavx512vl -mavx512bw tests.cpp

According to Intel's intrinsics documentation, these flags should be sufficient to run both foo and bar. However, when either foo or bar is run, I get the same error message.

If I remove foo, however, and compile WITHOUT -mavx512vl, I can run bar smoothly.

I already checked that my CPU supports the -mno-avx512vl and -mno-avx512bw flags, so it should support -mavx512vl and -mavx512bw, right?

What flags must I include to run both functions? Or am I missing something else?


Solution

  • Compile with gcc -march=native. If you get compile errors, your source tried to use something your CPU doesn't support.

    Related: Getting Illegal Instruction while running a basic Avx512 code


    "I already checked that my CPU supports the -mno-avx512vl and -mno-avx512bw flags, so it should support -mavx512vl and -mavx512bw, right?"

    That's the opposite of how GCC options work.

    -mno-avx512vl disables -mavx512vl if any earlier option (like -march=skylake-avx512 or -mavx512vl on its own) had set it.

    -march=broadwell doesn't enable AVX512 instructions because Broadwell CPUs can't run them natively. So -mno-avx512vl has exactly zero effect at the end of g++ -std=c++11 -march=broadwell -mavx ...

    Many options have long names starting with ‘-f’ or with ‘-W’—for example, -fmove-loop-invariants, -Wformat and so on. Most of these have both positive and negative forms; the negative form of -ffoo is -fno-foo. This manual documents only one of these two forms, whichever one is not the default.

    (Quoted from the GCC manual, introduction to section 3, "Invoking GCC".)

    (-m options follow the same convention as -f and -W long options.)

    This style of foo vs. no-foo is not unique to GCC; it's pretty common.


    Faulting on _mm256_lddqu_si256 after compiling with -mavx512vl

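    GCC is dumb and uses an EVEX encoding for the load (probably vmovdqu64) instead of a shorter VEX encoding. On a CPU without AVX-512, any EVEX-encoded instruction raises an illegal-instruction fault, which is why bar crashes too. But you told it AVX512VL was available, so this is only a missed optimization, not a correctness problem.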

    If you did compile the function with only AVX enabled, it would of course only use AVX instructions.