I googled a lot but could not manage to compile a C program that uses the _mm_clflushopt function. _mm_clflush works fine, but I want to be able to try the optimized version as well. I checked the CPU flags and clflushopt is included. I am including both the emmintrin.h and immintrin.h headers, but at compilation I still get an "undefined reference to _mm_clflushopt" error. I am compiling with gcc -o prog prog.c in a Linux terminal.
Including the x86intrin.h header gives me this error during compilation:
error: inlining failed in call to always_inline '_mm_clflushopt'
I would appreciate any help. I am very new to these instructions, and after searching for more information I could not find any C code that uses the optimized version. That's why I decided to ask a question.
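GCC only lets you use intrinsics for instructions that the target you're compiling for supports. That is what the "inlining failed in call to always_inline" error means: the intrinsic needs an instruction-set extension that isn't enabled for the function calling it. GCC will never emit clflushopt on its own, but this rule makes more sense for extensions like AVX2, where GCC does know how to auto-vectorize if you let it. You have to enable AVX2 instructions before GCC will allow itself to emit them, even if your source uses intrinsics.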
Use gcc -O3 -march=native to enable use of all the extensions present on the CPU you're compiling on. (-march still works without enabling optimization, but I included -O3 for future readers who are going to copy/paste the command.)
Or use -march=skylake or -march=znver1 (Zen), for example, to compile for a specific target CPU regardless of what host you're compiling on. See https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html
The specific option for just CLFLUSHOPT is -mclflushopt, but using -march=skylake also sets -mtune=skylake, which you also want. It also enables AVX2 and earlier, FMA (yes, that's separate from AVX2), BMI1/BMI2, popcnt, RDRAND, RDSEED, and lots of other goodies. (Compile with -march=skylake -fverbose-asm -S and look at the asm comments at the top of the file to see all the -m options enabled / not enabled.)
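For something to test with, here is a minimal sketch of a helper that flushes a buffer one cache line at a time. It isn't from the question: the name flush_range and the 64-byte CACHE_LINE constant are my own assumptions for illustration. It relies on GCC defining the __CLFLUSHOPT__ macro when -mclflushopt (or a -march that implies it) is in effect, so the same file still builds with plain gcc -o prog prog.c and falls back to _mm_clflush in that case. The trailing _mm_sfence is there because clflushopt is more weakly ordered than clflush; whether you need it depends on what you're measuring.

```c
#include <immintrin.h>   /* _mm_clflush, _mm_clflushopt, _mm_sfence */
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines, which is typical for current x86 CPUs. */
#define CACHE_LINE 64

/* Hypothetical helper: flush every cache line that overlaps
 * [addr, addr + len) and return how many lines were flushed. */
size_t flush_range(const void *addr, size_t len)
{
    if (len == 0)
        return 0;

    /* Round the start address down to a cache-line boundary. */
    uintptr_t p   = (uintptr_t)addr & ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t end = (uintptr_t)addr + len;
    size_t lines  = 0;

    for (; p < end; p += CACHE_LINE) {
#ifdef __CLFLUSHOPT__
        /* Only compiled when the extension is enabled,
         * e.g. with -mclflushopt or -march=native on a CPU that has it. */
        _mm_clflushopt((void *)p);
#else
        /* Baseline fallback: clflush is available on any x86-64 target. */
        _mm_clflush((const void *)p);
#endif
        lines++;
    }

    /* Order the flushes with respect to later stores. */
    _mm_sfence();
    return lines;
}
```

Call it as flush_range(buf, sizeof buf) from your own main. Building with gcc -O3 -march=native prog.c (or adding -mclflushopt) should select the clflushopt path; you can confirm which one you got by compiling with -S and searching the asm for clflushopt.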