c++ integer sse simd intrinsics

Most efficient way to check if all __m128i components are 0 [using <= SSE4.1 intrinsics]


I am using SSE intrinsics to determine if a rectangle (defined by four int32 values) has changed:

__m128i oldRect; // contains old left, top, right, bottom packed to 128 bits
__m128i newRect; // contains new left, top, right, bottom packed to 128 bits

__m128i xor = _mm_xor_si128(oldRect, newRect);

At this point, the resulting xor value will be all zeros if the rectangle hasn't changed. What is then the most efficient way of determining that?

Currently I am doing this:

if (xor.m128i_u64[0] | xor.m128i_u64[1])
{
    // rectangle changed
}

But I assume there's a smarter way (possibly using some SSE instruction that I haven't found yet).

I am targeting SSE4.1 on x64 and I am coding C++ in Visual Studio 2013.

Edit: This question is not quite the same as "Is an __m128i variable zero?", because that one specifies "on SSE-2-and-earlier processors" (although Antonio did add an answer "for completeness" that addresses SSE4.1 some time after this question was posted and answered).


Solution

  • You can use the PTEST instruction via the _mm_testz_si128 intrinsic (SSE4.1), like this:

    #include <smmintrin.h> // SSE4.1 header
    
    if (!_mm_testz_si128(xor, xor))
    {
        // rectangle has changed
    }
    

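    Note that _mm_testz_si128 returns 1 if the bitwise AND of its two arguments is all zeros, and 0 otherwise. Since ANDing xor with itself just gives xor, the negated result is true exactly when at least one bit differs between the two rectangles.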
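For anyone who wants to try this outside the original project, here is a minimal self-contained sketch of the same check. The `Rect` struct, the function names and the `SSE41_TARGET` macro are made up for illustration and are not from the question. The difference vector is called `diff` rather than `xor`, because `xor` is an alternative operator token in standard C++ and only works as an identifier in compilers (such as MSVC in its default mode) that do not enforce that. An SSE2-only variant is included for comparison; it should give the same answer on any x64 CPU.

```cpp
#include <smmintrin.h> // SSE4.1 (_mm_testz_si128); also pulls in the SSE2 intrinsics

// GCC/Clang need SSE4.1 enabled (e.g. -msse4.1) or a per-function target
// attribute; MSVC needs neither. SSE41_TARGET is a hypothetical helper macro.
#if (defined(__GNUC__) || defined(__clang__)) && !defined(__SSE4_1__)
#  define SSE41_TARGET __attribute__((target("sse4.1")))
#else
#  define SSE41_TARGET
#endif

// Hypothetical rectangle layout: four 32-bit ints, 16 bytes in total.
struct Rect { int left, top, right, bottom; };

// SSE4.1: XOR the two rectangles, then PTEST the difference against itself.
SSE41_TARGET inline bool rect_changed_sse41(const Rect& a, const Rect& b)
{
    const __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(&a));
    const __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(&b));
    const __m128i diff = _mm_xor_si128(va, vb);
    // _mm_testz_si128 yields 1 only when (diff & diff) is all zeros.
    return _mm_testz_si128(diff, diff) == 0;
}

// SSE2 only: compare the four lanes for equality, then check that every
// byte of the comparison mask is set (0xFFFF means all 16 bytes matched).
inline bool rect_changed_sse2(const Rect& a, const Rect& b)
{
    const __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(&a));
    const __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(&b));
    const __m128i eq = _mm_cmpeq_epi32(va, vb);
    return _mm_movemask_epi8(eq) != 0xFFFF;
}
```

Calling `rect_changed_sse41(oldRect, newRect)` with two `Rect` values returns true as soon as any of the four fields differs. If the rectangles are already held in `__m128i` variables, as in the question, the two unaligned loads can simply be dropped.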