Search code examples
Instruction/intrinsic for taking higher half of uint64_t in C++?...


c++ c bit-manipulation intrinsics instructions

What series of intrinsics will complete this paeth prediction code?...


c++ sse intrinsics

Convert 16 bits mask to 16 bytes mask...


c++ c bit-manipulation sse intrinsics

SSE2 intrinsics - comparing unsigned integers...


c++ x86 sse simd intrinsics

How to use VC++ intrinsic functions w/o run-time library...


c++ visual-c++ intrinsics memset demoscene

how to set a int32 value at some index within an m128i with only SSE2?...


c++ sse simd intrinsics sse2

Building sqlite3mc amalgamation fails with ‘_mm_aesimc_si128’: target specific option mismatch - Eve...


c++ c makefile intel intrinsics

Load or shuffle a pair of floats with SIMD intrinsics for doubles?...


c sse simd intrinsics avx

SIMD vectorization strategies for group-by operations on multiple, very large data arrays...


c# performance x86 simd intrinsics

Intrinsic __lzcnt64 returns different values with different compile options...


c gcc x86 intrinsics bmi

How do the AVX(2) gather instructions actually compute the fetch address?...


c++ simd intrinsics avx avx2

Fastest way to set __m256 value to all ONE bits...


bit-manipulation intrinsics avx avx2

AVX2 set __mm256d variable to all ones...


c vectorization intrinsics avx avx2

How can I convert u8 mask to u32 mask with ARM NEON intrinsic?...


c simd intrinsics neon

_mm256_loadu_epi64, _mm256_storeu_epi64 require avx512vl?...


c++ clang intrinsics avx2 avx512

Gcc misoptimises sse function...


c++ gcc sse intrinsics strict-aliasing

Memory alignment of Armadillo vectors vec/fvec...


c++ performance intrinsics armadillo

How to convert scalar code of the double version of VDT's Pade Exp fast_ex() approx into SSE2?...


c++ sse intrinsics sse2 exp

Xcode in release mode fails to compile <immintrin.h> - complains about __builtin_ia32_emms()...


c++ xcode x86-64 simd intrinsics

Can you pass generics to .NET Core hardware intrinsics methods?...


c# .net-core intrinsics

How is the arch parameter used when compiling code with visual studio?...


visual-c++ compiler-optimization simd intrinsics avx

Implementing C# hardware intrinsics wrapper issue...


c# intrinsics .net-5

How are __addgs* used, and what is GS?...


visual-c++ x86-64 intrinsics thread-local-storage memory-segmentation

How to best emulate the logical meaning of _mm_slli_si128 (128-bit bit-shift), not _mm_bslli_si128...


c sse simd intrinsics sse2

Is _mm_prefetch asynchronous? Profiling shows a lot of cycles on it...


c++ performance x86 intrinsics prefetch

Better way to store or extract scalar int result using SSE2 intrinsic...


c sse intrinsics sse2

Segfault while creating a vector of avx vectors...


c++ vector segmentation-fault intrinsics avx

Get member of __m128 by index?...


c++ clang sse simd intrinsics

CUDA half float operations without explicit intrinsics...


cuda intrinsics nvcc fma half-precision-float

Analog of _mm256_cmp_epi32_mask for AVX2...


c++ performance optimization intrinsics avx2
