Search code examples
Shuffle 16 bit vectors SSE...


sse simd

SSE Intrinsics and loop unrolling...


c++ optimization sse loop-unrolling

How to Multiply 2 16 bit vectors and store result in 32 bit vector in sse?...


c++ sse simd sse2

how to deinterleave image channel in SSE...


image-processing sse simd sse2

MOVAPS accesses unaligned address...


c++ visual-studio-2013 sse memory-alignment disassembly

C - How to access elements of vector using GCC SSE vector extension...


gcc sse

Unpacking a bitfield (Inverse of movmskb)...


assembly bit-manipulation sse sse2

What is the fastest way to do a SIMD gather without AVX(2)?...


x86 sse simd sse4

Is there any way to create a 16-byte aligned class that can be passed as a param...


c++ windows sse

x86 Assembly (SSE): Unexpected Multiplication Result...


assembly x86 sse masm masm32

Converting Gaussian function into SSE...


sse simd

Efficient min() function in SSE...


c++ sse

How can I set __m128i without using of any SSE instruction?...


c++ constants sse simd sse2

AVX VMOVDQA slower than two SSE MOVDQA?...


assembly sse bignum arbitrary-precision avx

_mm_store_si128 throws exception...


c++ sse simd

Fast implementation of covariance of two 8-bit arrays...


c++ image-processing optimization sse simd

Constant floats with SIMD...


c++ optimization sse simd

RyuJIT not making full use of SIMD intrinsics...


c# sse simd avx ryujit

Accepted XX:UseSSE values for Java JVM?...


java jvm sse

SSE (SIMD): multiply vector by scalar...


c x86 sse simd

What is the correct way of calculating a large CRC32...


c x86-64 sse crc32

Does Intel intrinsics load functions read from cache or RAM?...


intel sse intrinsics avx

What is the fastest way to test if a double number is integer (in modern intel X86 processors)...


c optimization assembly x86 sse

Why Do I Get A Stack Overflow Here?...


c++ stack-overflow sse

_mm_sad_epu8 faster than _mm_sad_pu8...


c sse intrinsics

Tiny SSE addpd loop slightly slower than scalar on AMD Phenom II?...


c++ c gcc assembly sse

Debugging xmm registers in Assembler...


visual-studio debugging assembly dll sse

Bitwise xor of two 256-bit integers...


sse simd avx

Why can't I remove _mm_empty()?...


c++ sse sse2 mmx

What does this x86 assembly instruction do (addsd xmm0, ds:__xmm@41f00000000000000000000000000000[ed...


assembly x86 sse
