Search code examples
Get member of __m128 by index?...

c++ clang sse simd intrinsics

CUDA half float operations without explicit intrinsics...

cuda intrinsics nvcc fma half-precision-float

Analog of _mm256_cmp_epi32_mask for AVX2...

c++ performance optimization intrinsics avx2

Is using C++20's std::popcount with vector optimization equivalent to the popcnt intrinsic?...

c++ language-lawyer c++20 intrinsics avx2

Using Intel Intrinsics to quickly find sum of array of integers...

c++ intrinsics avx avx2

Packing non-contiguous vector elements in AVX (and higher)...

x86 simd intrinsics avx avx2

SSE _mm_dp_ps size result...

c sse intrinsics

What is the correct way to fill a __m128i parameter, from basic type (such as short), to use with _m...

c++ x86 simd intrinsics avx2

Simulating packusdw functionality with SSE2...

x86 sse intrinsics sse2 sse4

Why do Java intrinsic functions still have code?...

java intrinsics

Making a function global in scope, like a compiler intrinsic...

c++ unit-testing arm intrinsics global

Header files for x86 SIMD intrinsics...

x86 header-files sse simd intrinsics

Interleaved merging of 2 AVX-512 vector elements - C intrinsic...

c hpc intrinsics avx avx512

Fastest way to calculate a digit-sum for a large number (as a decimal string)...

c assembly sse intrinsics avx512

Using SSE instructions with gcc without inline assembly...

c x86-64 sse simd intrinsics

How to copy X bytes or bits from an __m128i into standard memory...

c++ sse simd intrinsics sse2

Optimising column-wise maximum with SIMD...

c++ sse simd intrinsics avx

How to efficiently vectorize polynomial computation with condition (roofline model)...

eigen intrinsics avx2 auto-vectorization memory-bandwidth

Fastest method to calculate sum of all packed 32-bit integers using AVX512 or AVX2...

c intrinsics avx avx2 avx512

Problem including xmmintrin.h in my C++Builder application...

c++ c++builder sse intrinsics

Intrinsics SIMD instruction to replace values...

c# simd intrinsics

Can't use _m_prefetchw intrinsic with gcc/clang -march=native on older Intel CPU?...

c x86 clang intrinsics prefetch

Is there an Intel SIMD comparison function that returns 0 or 1 instead of 0 or 0xFFFFFFFF?...

intel sse simd intrinsics

What is the .NET Core SSE2 counterpart of _mm_set1_epi32...

c# .net-core sse simd intrinsics

How would you write feature-agnostic code for both AVX2 and AVX512?...

c++ c-preprocessor intrinsics avx2 avx512

Gathering half-float values using AVX...

intrinsics avx avx2 half-precision-float

Compile multi-architecture code using Agner's Vector Class Library...

c++ vectorization intrinsics avx vector-class-library

How to instruct MS Visual C++ compiler to use an uninitialized __m512i register...

c++ visual-c++ intrinsics micro-optimization avx512

Where is Clang's '_mm256_pow_ps' intrinsic?...

clang intel sse intrinsics avx

Using the blend instructions in Intel intrinsics (AVX)...

c++ c intrinsics avx immediate-operand
