SSE Examples and Free Source Code

Shuffle 16 bit vectors SSE...

sse simd

SSE Intrinsics and loop unrolling...

c++ optimization sse loop-unrolling

How to Multiply 2 16 bit vectors and store result in 32 bit vector in sse?...

c++ sse simd sse2

how to deinterleave image channel in SSE...

image-processing sse simd sse2

MOVAPS accesses unaligned address...

c++ visual-studio-2013 sse memory-alignment disassembly

C - How to access elements of vector using GCC SSE vector extension...

gcc sse

Unpacking a bitfield (Inverse of movmskb)...

assembly bit-manipulation sse sse2

What is the fastest way to do a SIMD gather without AVX(2)?...

x86 sse simd sse4

Is there any way to create a 16-byte aligned class that can be passed as a param...

c++ windows sse

x86 Assembly (SSE): Unexpected Multiplication Result...

assembly x86 sse masm masm32

Converting Gaussian function into SSE...

sse simd

Efficient min() function in SSE...

c++ sse

How can I set __m128i without using any SSE instruction?...

c++ constants sse simd sse2

AVX VMOVDQA slower than two SSE MOVDQA?...

assembly sse bignum arbitrary-precision avx

_mm_store_si128 throws exception...

c++ sse simd

Fast implementation of covariance of two 8-bit arrays...

c++ image-processing optimization sse simd

Constant floats with SIMD...

c++ optimization sse simd

RyuJIT not making full use of SIMD intrinsics...

c# sse simd avx ryujit

Accepted XX:UseSSE values for Java JVM?...

java jvm sse

SSE (SIMD): multiply vector by scalar...

c x86 sse simd

What is the correct way of calculating a large CRC32...

c x86-64 sse crc32

Do Intel intrinsics load functions read from cache or RAM?...

intel sse intrinsics avx

What is the fastest way to test if a double number is integer (in modern intel X86 processors)...

c optimization assembly x86 sse

Why Do I Get A Stack Overflow Here?...

c++ stack-overflow sse

_mm_sad_epu8 faster than _mm_sad_pu8...

c sse intrinsics

Tiny SSE addpd loop slightly slower than scalar on AMD Phenom II?...

c++ c gcc assembly sse

Debugging xmm registers in Assembler...

visual-studio debugging assembly dll sse

Bitwise xor of two 256-bit integers...

sse simd avx

Why can't I remove _mm_empty()?...

c++ sse sse2 mmx

What does this x86 assembly instruction do (addsd xmm0, ds:__xmm@41f00000000000000000000000000000[ed...

assembly x86 sse