Search code examples
When source registers in avx instruction can be reused...


assembly cpu-architecture simd avx micro-optimization

Why can't I specify the calling convention for a constructor(C++)?...


c++ visual-c++ visual-studio-2013 simd calling-convention

SIMD prefix sum on Intel cpu...


c++ sse simd prefix-sum

How to convert int 64 to int 32 with avx (but without avx-512)...


simd sse avx

int8 x uint8 matrix-vector product with column-major layout...


assembly x86 simd sse avx

How can I count the occurrence of a byte in array using SIMD?...


c# .net simd system.numerics

Is the "throughput" listed by Intel per thread or per core?...


assembly x86 simd sse intrinsics

How do I enable SSE4.1 and SSE3 (but NOT AVX) in MSVC...


visual-c++ sse simd sse4

What's the difference between logical SSE intrinsics?...


c sse simd intrinsics sse2

A better 8x8 bytes matrix transpose with SSE?...


c matrix optimization sse simd

How to interleave 3 float vectors into an array with AVX intrinsics C++...


c++ simd intrinsics avx avx2

Why is there no SIMD functionality in the C++ standard library?...


c++ stl simd

Do AVX512 mask register reduce the execution time?...


performance x86-64 simd avx512

Proper use of _mm256_maskload_ps for loading less than 8 floats into __m256...


c++ simd avx

Can I assign the result of an intrinsic that returns __m128i to a variable of the type __m128i_u?...


simd sse intrinsics sse2

OMP SIMD logical AND on unsigned long long...


c++ performance bit-manipulation openmp simd

Using F# and SIMD to search for index of value...


f# simd avx

How can I extract a byte from __m256i AVX2 register into another __m256i register?...


c simd intrinsics avx avx2

Unpacking 8 to 16-bit using SIMD: AVX2 version mixes up the order...


c++ simd sse avx2

Getting started with Intel x86 SSE SIMD instructions...


c gcc x86 sse simd

how to debug a _mm_mul_ps function?...


c++ segmentation-fault sse simd intrinsics

Building GCC SIMD vector constants using constexpr functions (rather than literals)...


c++ gcc c++20 simd constexpr

what is dark magic behind meta.Vectors?...


simd zig

What does `vaddhn_high_s16` actually do?...


c++ simd intrinsics arm64 neon

C# .Net SIMD System.Numerics.Vector4 slower than loop...


c# vectorization simd

openmp omp declare uniform this not supported in GCC?...


c++ gcc openmp simd

Construct a 64 bit mask register from four 16 bit ones...


x86-64 simd avx512

_mm256_rem_epu64 intrinsic not found with GCC 10.3.0...


c++ simd avx avx512

_mm256_packs_epi32, except pack sequentially...


x86-64 simd avx2

C++ Optimize Memory Read Speed...


c++ performance simd apple-m1 memory-bandwidth
