Search code examples
Divide 8-bit integers by 4 (or shift) using SSE...


c++ x86 sse simd intrinsics

Zero remaining Bytes after first Zero in SSE Register...


c++ sse sse4

inlining failed in call to always_inline ‘_mm_mullo_epi32’: target specific option mismatch...


c cmake x86 sse simd

Fastest Implementation of the Natural Exponential Function Using SSE...


c optimization vectorization sse simd

How to simulate pcmpgtq on sse2?...


assembly sse simd sse2 sse4

What is the most efficient way to do unsigned 64 bit comparison on SSE2?...


assembly sse simd sse2

Set Last Value in __m128 vector register...


c++ simd sse avx

Is there anything more I need to do before using SSE instructions?...


assembly x86 simd sse avx

Improve SSE (SSSE3) YUV to RGB code...


optimization assembly rgb sse yuv

print a __m128i variable...


c assembly sse simd intrinsics

How does MSVC avoid mixing SSE and AVX?...


c++ visual-c++ sse avx

Is my understanding of AoS vs SoA advantages/disadvantages correct?...


caching memory sse simd data-oriented-design

How to solve the 32-byte-alignment issue for AVX load/store operations?...


c++ sse simd memory-alignment avx

Can std::replace implementation make redundant writes to the passed array?...


c++ language-lawyer vectorization sse avx

Dot product performance with SSE instructions: is DPPS worth using?...


assembly x86 simd sse dot-product

how can I use SVML instructions...


c++ x86 sse simd

How to properly use prefetch instructions?...


caching x86 sse prefetch dot-product

C++ error: ‘_mm_sin_ps’ was not declared in this scope...


c++ optimization sse simd intrinsics

What is the point of SSE2 instructions such as orpd?...


assembly x86 sse instruction-set sse2

SSE multiplication of 4 32-bit integers...


x86 sse simd multiplication sse2

Do all CPUs which support AVX2 also support SSE4.2 and AVX?...


sse simd avx avx2

Can PTEST be used to test if two registers are both zero or some other condition?...


assembly x86 sse intrinsics sse4

Determine cause of segfault when using -O3?...


c++ gdb sse gcc4.9

Find the first instance of a character using simd...


x86 sse simd avx avx2

What are the 128-bit to 512-bit registers used for?...


assembly x86-64 sse simd cpu-registers

How do I enable SSE for my freestanding bootable code?...


x86 sse instruction-set

What is the meaning of "non temporal" memory accesses in x86...


x86 sse assembly

Fastest way to do horizontal SSE vector sum (or other reduction)...


assembly optimization floating-point sse simd

Benefits of x87 over SSE...


x86 x86-64 sse fpu x87

Why does mulss take only 3 cycles on Haswell, different from Agner's instruction tables? (Unroll...


c assembly x86 sse micro-optimization
