Search code examples
_mm_max_ss has different behavior between clang and gcc...


c++, gcc, x86, clang, sse

_mm_load_si128 loads data in reverse order...


c, sse, simd, sse2

Gcc misoptimises sse function...


c++, gcc, sse, intrinsics, strict-aliasing

How to convert scalar code of the double version of VDT's Pade Exp fast_ex() approx into SSE2?...


c++, sse, intrinsics, sse2, exp

What is the "correct" way to go from avx/sse masks to avx512 masks?...


c++, sse, avx, avx512

SSE Compare Packed Unsigned Bytes...


x86, comparison, unsigned, sse

How to best emulate the logical meaning of _mm_slli_si128 (128-bit bit-shift), not _mm_bslli_si128...


c, sse, simd, intrinsics, sse2

What does ordered / unordered comparison mean?...


assembly, x86, floating-point, sse

Count integers in an array where the set bits are a subset of a given mask...


c++, optimization, sse, avx, bitmask

Which are the use case of punpcklbw (interleave in MMX/SSE/AVX)?...


assembly, compression, sse, disassembly, memset

Better way to store or extract scalar int result using SSE2 intrinsic...


c, sse, intrinsics, sse2

What is packed and unpacked and extended packed data...


cpu-architecture, sse, simd, avx, avx2

x86 SIMD instructions 16 byte alignment in assembly (Without C intrinsics)...


assembly, x86-64, sse, simd, memory-alignment

Expand the lower two 32-bit floats of an xmm register to the whole xmm register...


assembly, x86, sse

Get member of __m128 by index?...


c++, clang, sse, simd, intrinsics

Writing a portable SSE/AVX version of std::copysign...


c++, x86-64, sse, simd, avx

SSE optimization of Gaussian blur...


c++, optimization, sse, simd, gaussianblur

How to calculate mod/remainder using SSE?...


assembly, sse, division

Most recent processor without support of SSSE3 instructions?...


x86, sse, simd, instruction-set

How to combine two __m128 values to __m256?...


c, x86, sse, simd, avx

Vectorization of modulo multiplication...


c++, algorithm, sse, simd, avx

Comparing quadwords in xmm...


assembly, x86, nasm, sse

Libc hypot function seems to return incorrect results for double type... why?...


c++, floating-point, sse, glibc, hypotenuse

Why move 32-bit register to stack then from stack to xmm register?...


assembly, x86, sse, att, micro-optimization

Set an XMM register to a repeating byte pattern (broadcast a constant byte)...


assembly, sse, micro-optimization, sse2

Multiplying different types in AVX512...


c++, c, sse, avx, avx512

Why does GCC or Clang not optimise reciprocal to 1 instruction when using fast-math...


c++, sse, compiler-optimization, simd, fast-math

How do I clamp __m128i signed integers into non-negative unsigned integers in SSE...


c++, max, sse, clamp

Is it safe to use xmm registers to save the general-purpose ones?...


assembly, x86, sse, inline-assembly

Is there a shift 128/256 bits by 1 instruction?...


sse, simd, avx
