Search code examples
Dereference pointers in XMM register (gather)...


pointers sse simd

Extracting ints and shorts from a struct using AVX?...


c++ x86 sse simd avx

load 32 bits from memory into xmm register...


sse inline-assembly intrinsics sse2 mmx

Intel Intrinsics guide - Latency and Throughput...


performance x86 intel sse intrinsics

SIMD prefix sum on Intel cpu...


c++ sse simd prefix-sum

Is there a difference between SVML vs. normal intrinsic square root functions?...


c++ intel sse intrinsics sse2

How to convert int 64 to int 32 with avx (but without avx-512)...


simd sse avx

int8 x uint8 matrix-vector product with column-major layout...


assembly x86 simd sse avx

Is the "throughput" listed by Intel per thread or per core?...


assembly x86 simd sse intrinsics

How do I enable SSE4.1 and SSE3 (but NOT AVX) in MSVC...


visual-c++ sse simd sse4

Are there unsigned equivalents of the x87 FILD and SSE CVTSI2SD instructions?...


assembly floating-point sse floating-point-conversion x87

accessing __m128 fields across compilers...


visual-c++ g++ sse icc

Using AVX CPU instructions: Poor performance without "/arch:AVX"...


c++ performance visual-studio-2010 sse avx

SSE3 instructions in F#...


f# sse

Which versions of Windows support/require which CPU multimedia extensions? (How to check if SSE or A...


windows assembly sse avx avx512

What's the difference between logical SSE intrinsics?...


c sse simd intrinsics sse2

Does _mm_stream_load_si128 (movntdqa) modify the memory its argument points to?...


c assembly x86 sse intrinsics

A better 8x8 bytes matrix transpose with SSE?...


c matrix optimization sse simd

fast multiplication of int8 arrays by scalars...


c assembly x86 sse 8-bit

Finding Next Ascii Space With _mm_cmpeq_epi8 Returning 0...


c sse intrinsics

Can I assign the result of an intrinsic that returns __m128i to a variable of the type __m128i_u?...


simd sse intrinsics sse2

Unpacking 8 to 16-bit using SIMD: AVX2 version mixes up the order...


c++ simd sse avx2

Getting started with Intel x86 SSE SIMD instructions...


c gcc x86 sse simd

my intrinsic function in getting the dot product of an int array is slower than the normal code, wha...


c++ cpu sse intrinsics dot-product

how to debug a _mm_mul_ps function?...


c++ segmentation-fault sse simd intrinsics

Why does inverting the parameters to a CMPGT comparison function work as a CMPLT?...


c++ sse intrinsics avx2

How _mm_prefetch works?...


assembly caching sse intrinsics prefetch

Is it possible to move 8 bits from an XMM register to memory without using general purpose registers?...


assembly nasm sse

Is there an AVX2 instruction (and intrinsic) to broadcast load a 16 bit value 16 times into an __m25...


c++ sse intrinsics avx avx2

Check XMM register for all zeroes...


c++ sse simd intrinsics
