Search code examples
Sorting 64-bit structs using AVX?...

c++ sorting simd intrinsics avx

How to set all the values in AVX ymm register to be the same (all are 0/1/specific value)?...

assembly cpu-registers avx avx2

How do I perform a bitwise NOT in SSE/AVX?...

x86 bit-manipulation simd sse avx

Why does SIMD have single data instructions when it's called SIMD?...

cpu-architecture simd sse cpu-registers avx

C++ error: intrinsic function was not declared in scope...

c++ gcc intrinsics avx avx2

How to use the Intel AVX in Java?...

java simd avx

Casting structs to add definition to a shared-memory block in a SIMD application...

c++ casting shared-memory simd avx

MinGW64 Is Incapable of 32 Byte Stack Alignment (Required for AVX on Windows x64), Easy Work Around ...

windows gcc mingw-w64 memory-alignment avx

Theoretical maximum performance (FLOPS ) of Intel Xeon E5-2640 v4 CPU, using only addition?...

cpu intel cpu-architecture avx flops

Implementing matrix operation using AVX in C...

c matrix matrix-multiplication simd avx

What is the /d2vzeroupper MSVC compiler optimization flag doing?...

c++ visual-c++ avx compiler-flags amd-processor

Bit-twiddling Wizardry for Index of Min or Max Element in XMM/YMM/ZMM...

assembly x86 simd avx

Fastest way to do horizontal vector sum with AVX instructions...

x86 sse simd avx vector-processing

Is it possible to get multiple sines in AVX/SSE?...

windows x86-64 trigonometry sse avx

AVX divide __m256i packed 32-bit integers by two (no AVX2)...

c++ simd sse avx sse2

Is it possible to popcount __m256i and store result in 8 32-bit words instead of the 4 64-bit using ...

c++ intel sse avx avx2

Accumulating Doubles Into Bins via intrinsics...

c++ simd avx avx2

What is causing this memory access violation error (0xC0000005) when using Eigen with "-march=n...

c++ gcc mingw eigen avx

weird auto-vectorization in gcc with different results on godbolt...

c gcc avx auto-vectorization godbolt

count number of unique values in a 128bit avx vector, or detecting if all elements are equal?...

c simd sse intrinsics avx

What is the difference between MOVDQA and MOVNTDQA, and VMOVDQA and VMOVNTDQ for WB/WC marked region...

assembly x86 sse simd avx

Using ymm registers as a "memory-like" storage location...

assembly x86 sse avx

How to compare two vectors using SIMD and get a strncmp like result?...

c simd avx avx2

How to detect SSE/SSE2/AVX/AVX2/AVX-512/AVX-128-FMA/KCVI availability at compile-time?...

gcc clang sse avx avx512

Efficiently load/compute/pack 64 double comparison results in uint64_t bitmask...

c++ optimization simd avx avx2

SSE runs slow after using AVX...

c++ gcc x86 avx sse2

Difference between _mm256_extractf32x4_ps and _mm256_extractf128_ps...

c++ c intrinsics avx avx512

What is "MAX" referring to in the intel intrinsics documentation?...

c++ c intrinsics avx avx512

What is the correct intrinsic sequence to do PSRLDQ to an XMM register while keeping the YMM part un...

c assembly x86 intrinsics avx

How to constexpr initialize intrinsic SSE/AVX register?...

c++ sse constexpr intrinsics avx
