Search code examples
count number of unique values in a 128bit avx vector, or detecting if all elements are equal?...


c simd sse intrinsics avx

How to emulate _mm256_loadu_epi32 with gcc or clang?...


c++ c intrinsics avx512

Why does SSE set (_mm_set_ps) reverse the order of arguments...


c++ c simd sse intrinsics

Difference between __builtin_addcll and _addcarry_u64...


c++ clang intel intrinsics built-in

Can't use uint64_t with rdrand as it expects unsigned long long, but uint64_t is defined as unsi...


c++ long-integer portability intrinsics rdrand

Linking error when building without CRT, memcpy and memset intrinsic functions...


c++ memcpy intrinsics crt memset

Efficient overflow-immune arithmetic mean in C/C++...


c++ c optimization compiler-optimization intrinsics

Manipulate vector register as float32x4_t C variable in ARM...


c assembly inline-assembly arm64 intrinsics

Why does Clang complain about alignment on SSE intrinsic unaligned loads...


clang sse intrinsics memory-alignment

Population count in AVX512...


simd intrinsics avx512 hammingweight

MSVC 2019 _fxrstor64 and _fxsave64 intrinsics availability...


c++ visual-c++ intrinsics

What are the names and meanings of the intrinsic vector element types, like epi64x or pi32?...


intel sse intrinsics sse2 mmx

Why does the pseudocode of _mm_insert_ps calculate %8?...


intrinsics sse4

Difference between _mm256_extractf32x4_ps and _mm256_extractf128_ps...


c++ c intrinsics avx avx512

What is "MAX" referring to in the intel intrinsics documentation?...


c++ c intrinsics avx avx512

What is the correct intrinsic sequence to do PSRLDQ to an XMM register while keeping the YMM part un...


c assembly x86 intrinsics avx

How to constexpr initialize intrinsic SSE/AVX register?...


c++ sse constexpr intrinsics avx

What is the difference between these 128bit SIMD xor operations...


simd sse intrinsics sse2

Using Intrinsics to Extract And Shift Odd/Even Bits...


c++ bit-manipulation intrinsics micro-optimization

What is the most efficient way to handle integer multiplication overflow with saturation with ARM Ne...


arm simd intrinsics neon saturation-arithmetic

ARMv7 NEON: Unpack 32 bit mask to 64 bit mask...


c++ arm simd intrinsics neon

Organizing multiple implementations (for SIMD)...


c++ simd intrinsics instruction-set

Discrepancy in result of Intrinsics vs Naive Vector reduction...


c++ vector simd ieee-754 intrinsics

What is the equivalent of v4sf and __attribute__ in Visual Studio C++?...


c++ gcc visual-c++ sse intrinsics

Rust compiler not optimising lzcnt? (and similar functions)...


rust x86 bit-manipulation compiler-optimization intrinsics

How does the _mm256_shuffle_epi8 make sense in this Game of Life implementation?...


c++ intrinsics avx conways-game-of-life

AVX2: BitScanReverse or CountLeadingZeros on 8 bit elements in AVX register...


c++ simd intrinsics avx avx2

AVX2: CountTrailingZeros on 8 bit elements in AVX register...


c++ simd intrinsics avx avx2

Using Half Precision Floating Point on x86 CPUs...


c++ c x86 intrinsics half-precision-float

_umul128 on Windows 32 bits...


visual-c++ x86 multiplication biginteger intrinsics
