Tags: stl, simd, avx2, clamp

AVX2 equivalent of std::clamp


Given a precision p between 1 and 16, I would like to clamp every element of an AVX2 integer register to the range [-p/2, p/2]. I currently do this with std::clamp on scalar (non-AVX2) integers.

Is there a way of doing this with AVX2?


Solution

  • Implement a saturating clamp the standard way with x = min(max(x, lower_limit), upper_limit), using whatever width of integer you want. Or let a compiler auto-vectorize std::clamp for you.

    8-, 16-, or 32-bit elements are convenient. AVX2 doesn't have packed min/max for 64-bit integers, but you could emulate it with vpcmpgtq and a blend (AVX-512 has vpmaxsq / vpminsq). With just SSE2, only a couple of size / signedness combinations of min/max were available (signed 16-bit and unsigned 8-bit). SSE4.1 filled in the rest, so AVX2 has all 3 sizes in both signed and unsigned.

    For example, for 8-bit integers, use _mm256_max_epi8 for signed-integer max on __m256i vectors, and _mm256_min_epi8 for the matching min.

    See Intel's intrinsics guide to find the intrinsics for other element sizes and signedness.
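As a rough sketch of the min(max(x, lo), hi) approach above, here is one way it could look for signed 32-bit elements, plus a compare-and-blend emulation for 64-bit elements. The question doesn't state the element width, so 32-bit is an assumption. The helper names (clamp_epi32, clamp_epi64, clamp_array_i32, clamp_array_i64) and the AVX2_TARGET macro are made up for illustration, and the bounds are assumed to be -(p/2) and p/2 with truncating integer division.

```cpp
#include <immintrin.h>
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: on GCC/Clang, enable AVX2 per function so this builds without
// -mavx2. On other compilers, enable AVX2 with a flag such as /arch:AVX2.
#if defined(__GNUC__) || defined(__clang__)
#define AVX2_TARGET __attribute__((target("avx2")))
#else
#define AVX2_TARGET
#endif

// Clamp eight signed 32-bit lanes to [lo, hi]: min(max(x, lo), hi).
AVX2_TARGET static inline __m256i clamp_epi32(__m256i x, __m256i lo, __m256i hi) {
    return _mm256_min_epi32(_mm256_max_epi32(x, lo), hi);
}

// Clamp four signed 64-bit lanes. AVX2 has no packed 64-bit min/max,
// so select with vpcmpgtq (signed compare) and a byte blend instead.
AVX2_TARGET static inline __m256i clamp_epi64(__m256i x, __m256i lo, __m256i hi) {
    __m256i below = _mm256_cmpgt_epi64(lo, x);  // all-ones where x < lo
    x = _mm256_blendv_epi8(x, lo, below);       // take lo in those lanes
    __m256i above = _mm256_cmpgt_epi64(x, hi);  // all-ones where x > hi
    return _mm256_blendv_epi8(x, hi, above);    // take hi in those lanes
}

// Hypothetical wrapper: clamp n int32_t values to [-(p/2), p/2].
AVX2_TARGET void clamp_array_i32(const std::int32_t* in, std::int32_t* out,
                                 std::size_t n, int p) {
    assert(p >= 1 && p <= 16);  // precision range stated in the question
    const std::int32_t lo_s = -(p / 2), hi_s = p / 2;
    const __m256i lo = _mm256_set1_epi32(lo_s);
    const __m256i hi = _mm256_set1_epi32(hi_s);
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8) {  // 8 lanes per 256-bit vector
        __m256i v = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(in + i));
        _mm256_storeu_si256(reinterpret_cast<__m256i*>(out + i),
                            clamp_epi32(v, lo, hi));
    }
    for (; i < n; ++i)  // scalar tail, same result as std::clamp(in[i], lo_s, hi_s)
        out[i] = std::min(std::max(in[i], lo_s), hi_s);
}

// Hypothetical wrapper: the same for int64_t values, 4 lanes per vector.
AVX2_TARGET void clamp_array_i64(const std::int64_t* in, std::int64_t* out,
                                 std::size_t n, int p) {
    assert(p >= 1 && p <= 16);
    const std::int64_t lo_s = -(p / 2), hi_s = p / 2;
    const __m256i lo = _mm256_set1_epi64x(lo_s);
    const __m256i hi = _mm256_set1_epi64x(hi_s);
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m256i v = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(in + i));
        _mm256_storeu_si256(reinterpret_cast<__m256i*>(out + i),
                            clamp_epi64(v, lo, hi));
    }
    for (; i < n; ++i)  // scalar tail
        out[i] = std::min(std::max(in[i], lo_s), hi_s);
}
```

For 8-bit or 16-bit elements the same pattern should work by swapping in _mm256_max_epi8 / _mm256_min_epi8 or _mm256_max_epi16 / _mm256_min_epi16 and the matching _mm256_set1_* broadcast. The code must run on a CPU that actually supports AVX2.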