Search code examples
MSVC 2019 _fxrstor64 and _fxsave64 intrinsics availability...

c++ visual-c++ intrinsics

What are the names and meanings of the intrinsic vector element types, like epi64x or pi32?...

intel sse intrinsics sse2 mmx

Why does the pseudocode of _mm_insert_ps calculate %8?...

intrinsics sse4

Difference between _mm256_extractf32x4_ps and _mm256_extractf128_ps...

c++ c intrinsics avx avx512

What is "MAX" referring to in the intel intrinsics documentation?...

c++ c intrinsics avx avx512

What is the correct intrinsic sequence to do PSRLDQ to an XMM register while keeping the YMM part un...

c assembly x86 intrinsics avx

How to constexpr initialize intrinsic SSE/AVX register?...

c++ sse constexpr intrinsics avx

What is the difference between these 128bit SIMD xor operations...

simd sse intrinsics sse2

Using Intrinsics to Extract And Shift Odd/Even Bits...

c++ bit-manipulation intrinsics micro-optimization

What is the most efficient way to handle integer multiplication overflow with saturation with ARM Ne...

arm simd intrinsics neon saturation-arithmetic

ARMv7 NEON: Unpack 32 bit mask to 64 bit mask...

c++ arm simd intrinsics neon

Organizing multiple implementations (for SIMD)...

c++ simd intrinsics instruction-set

Discrepancy in result of Intrinsics vs Naive Vector reduction...

c++ vector simd ieee-754 intrinsics

What is the equivalent of v4sf and __attribute__ in Visual Studio C++?...

c++ gcc visual-c++ sse intrinsics

Rust compiler not optimising lzcnt? (and similar functions)...

rust x86 bit-manipulation compiler-optimization intrinsics

How does the _mm256_shuffle_epi8 make sense in this Game of Life implementation?...

c++ intrinsics avx conways-game-of-life

AVX2: BitScanReverse or CountLeadingZeros on 8 bit elements in AVX register...

c++ simd intrinsics avx avx2

AVX2: CountTrailingZeros on 8 bit elements in AVX register...

c++ simd intrinsics avx avx2

Using Half Precision Floating Point on x86 CPUs...

c++ c x86 intrinsics half-precision-float

_umul128 on Windows 32 bits...

visual-c++ x86 multiplication biginteger intrinsics

access violation _mm_store_si128 SSE Intrinsics...

c++ x86 simd sse intrinsics

Merge two bitmask with conflict resolving, with some required distance between any two set bits...

c++ x86 bit-manipulation intrinsics

How to load into __m256 from a float* but reading backwards in memory as opposed to forwards?...

c++ c x86-64 intrinsics avx

ARM NEON: Regular C code is faster than ARM Neon code in simple multiplication?...

arm simd intrinsics neon

How do I enable all Intel Intrinsic options in GCC?...

gcc x86 intrinsics

AVX512 - How to move all set bits to the right?...

c bit-manipulation simd intrinsics avx512

Are there ARM Neon instructions for round function?...

c arm rounding intrinsics neon

Accumulating a running-total (prefix sum) horizontally across an __m256i vector...

c vectorization x86-64 intrinsics avx2

What are _mm_prefetch() locality hints?...

c++ x86-64 intrinsics cpu-cache prefetch

AVX2: Is there a way to implement _mm256_mul_epi8 function for a constant power of 2?...

c++ simd intrinsics avx avx2
