Fast pyrDown image with AVX instructions...

c++ image-processing computer-vision sse avx

Do I need to use _mm256_zeroupper in 2021?...

c++ sse simd intrinsics avx

Understanding the SIMD shuffle control mask...

c gcc simd avx

Do 128bit cross lane operations in AVX512 give better performance?...

performance x86 intel avx avx512

Get sum of values stored in __m256d with SSE/AVX...

c++ optimization sse avx avx2

How to perform the inverse of _mm256_movemask_epi8 (VPMOVMSKB)?...

c x86 simd avx avx2

SSE/AVX: Choose from two __m256 float vectors based on per-element min and max absolute value...

sse intrinsics avx avx512

Memory argument of VMOVDQU partially out of allocated range...

assembly x86-64 simd access-violation avx

Why is masking needed before using a pshufb shuffle as a lookup table for nibbles?...

c++ simd sse avx avx2

Unpacking real and imaginary parts of complex numbers into separate ymm registers...

complex-numbers intrinsics avx avx2

C: is it possible to cast a uint64_t to const __m256i_u?...

c gcc x86-64 avx

How to use Vector Class Library for AVX vectorization together with the openmp #pragma omp parallel ...

c++ parallel-processing openmp avx vector-class-library

AVX2 _mm256_cmp_pd to return number values...

c++ x86 comparison avx avx2

How to check if a CPU supports the SSE3 instruction set?...

c++ sse instruction-set avx cpuid

How to tell if a Linux machine supports AVX/AVX2 instructions?...

linux unix avx suse avx2

Intel Intrinsics Guide relative error definition...

x86 floating-point precision intrinsics avx

bad_function_call thrown and segmentation fault caused when passing avx variables to std::function...

c++ debugging g++ vectorization avx

The AVX intrinsic _mm256_rsqrt_ps has much greater relative error than it should have according to t...

c++ floating-point intrinsics avx

Using SIMD/AVX/SSE for tree traversal...

performance assembly simd micro-optimization avx

What happens when I compile on machine that supports avx2 and run the binary on another machine that...

c++ avx avx2

How do you handle indivisible vector lengths with SIMD intrinsics, array not a multiple of vector wi...

c++ vectorization simd intrinsics avx

Does AVX/AVX2 "exists" on each core?...

c++ cpu-architecture simd avx avx2

Does anyone have an example where _mm256_stream_load_si256 (non-temporal load to bypass cache) actua...

performance x86 cpu-architecture hpc avx

How to find the horizontal maximum in a 256-bit AVX vector...

x86 simd avx vector-processing avx2

How to check inf for AVX intrinsic __m256...

c++ c sse intrinsics avx

SSE/AVX: using float shuffles + casts as substitute for missing integer shuffle intrinsics?...

x86 sse avx

Does .NET Framework 4.5 provide SSE4/AVX support?...

.net simd .net-4.5 avx sse4

The Effect of Architecture When Using SSE / AVX Intrinsics...

gcc sse intrinsics avx icc

Rust target-cpu=native gets slower SIMD execution...

rust simd intrinsics avx

Horizontal min on avx2 8 float register and shuffle paired registers alongside...

c++ simd sse avx avx2
