accessing __m128 fields across compilers...

visual-c++ g++ sse icc

Using AVX CPU instructions: Poor performance without "/arch:AVX"...

c++ performance visual-studio-2010 sse avx

SSE3 instructions in F#...

f# sse

Which versions of Windows support/require which CPU multimedia extensions? (How to check if SSE or A...

windows assembly sse avx avx512

What's the difference between logical SSE intrinsics?...

c sse simd intrinsics sse2

Does _mm_stream_load_si128 (movntdqa) modify the memory its argument points to?...

c assembly x86 sse intrinsics

A better 8x8 bytes matrix transpose with SSE?...

c matrix optimization sse simd

fast multiplication of int8 arrays by scalars...

c assembly x86 sse 8-bit

Finding Next Ascii Space With _mm_cmpeq_epi8 Returning 0...

c sse intrinsics

Can I assign the result of an intrinsic that returns __m128i to a variable of the type __m128i_u?...

simd sse intrinsics sse2

Unpacking 8 to 16-bit using SIMD: AVX2 version mixes up the order...

c++ simd sse avx2

Getting started with Intel x86 SSE SIMD instructions...

c gcc x86 sse simd

my intrinsic function in getting the dot product of an int array is slower than the normal code, wha...

c++ cpu sse intrinsics dot-product

how to debug a _mm_mul_ps function?...

c++ segmentation-fault sse simd intrinsics

Why does inverting the parameters to a CMPGT comparison function work as a CMPLT?...

c++ sse intrinsics avx2

How _mm_prefetch works?...

assembly caching sse intrinsics prefetch

Is it possible to move 8 bits from an XMM register to memory without using general purpose registers?...

assembly nasm sse

Is there an AVX2 instruction (and intrinsic) to broadcast load a 16 bit value 16 times into an __m25...

c++ sse intrinsics avx avx2

Check XMM register for all zeroes...

c++ sse simd intrinsics

No insert and extract for float/double in SSE and AVX?...

c++ floating-point sse simd avx

How to load 16 bytes of memory into a Rust __m128i?...

rust sse simd intrinsics

How to detect sse availability in CMake...

build cross-platform cmake sse

Unable to compile assembly code with xmmword operand-size using nasm...

assembly nasm sse 128-bit

Why is SIMD slower than scalar counterpart...

assembly x86 sse simd

When does data move around between SSE registers and the stack?...

c++ sse simd cpu-registers register-allocation

Segfaults with Intel Intrinsics...

c intel sse intrinsics memory-alignment

Fast byte-wise replace if...

c optimization x86 sse simd

emmintrin.h:31:3: error: #error "SSE2 instruction set not enabled" # error "SSE2 inst...

c++ linux makefile cmake sse

shuffling upper 32 bits with lower 32 bits in m128...

c sse simd intrinsics

result with/without SSE simd operation is different...

c++ sse
