In one of the solutions, the author computed abs(inp) for AVX vectors as:
__m256 sign_bit = _mm256_set1_ps(-0.0f);
__m256 inp_abs = _mm256_andnot_ps(sign_bit, inp);
What's the logic behind it?
SSE/AVX: Choose from two __m256 float vectors based on per-element min and max absolute value
IEEE 754 represents floating-point numbers with a sign bit, significand and exponent. The sign bit is set for a negative number and clear for a positive number. So absolute value can be computed by simply clearing the sign bit of a number.
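To make the sign-bit idea concrete, here is a minimal scalar sketch. It assumes float is a 32-bit IEEE 754 binary32 value, which holds on the x86 targets discussed here but is not guaranteed by the C standard. The function name abs_by_clearing_sign is made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns |x| by clearing bit 31, the sign bit.
   Assumes float is a 32-bit IEEE 754 binary32 value. */
float abs_by_clearing_sign(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);  /* copy the float's bit pattern into an integer */
    bits &= UINT32_C(0x7FFFFFFF);    /* keep exponent and significand, clear the sign */
    memcpy(&x, &bits, sizeof x);     /* copy the modified bit pattern back */
    return x;
}
```

Calling abs_by_clearing_sign(-1.5f) gives 1.5f, and a positive input comes back unchanged because its sign bit is already clear.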
The number -0.0f has a significand and exponent that are all-bits-zero and a negative sign, so its binary representation has the sign bit set and all other bits clear (0x80000000). Therefore it can be used as a mask for the sign bit. The _mm256_set1_ps intrinsic broadcasts this 32-bit value to all eight elements of the 256-bit vector sign_bit, and _mm256_andnot_ps(sign_bit, inp) computes the bitwise AND of inp with the NOT of sign_bit, that is inp & ~sign_bit, which clears the sign bit of each element and doesn't change anything else.
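The two intrinsics from the question can be wrapped in a small function to try them out. This is only a sketch: the name abs8_avx and the pointer-based interface are assumptions made for illustration, and the target attribute is a GCC/Clang extension used so that the function compiles without -mavx (with other compilers, drop it and enable AVX for the whole file). Running it requires a CPU with AVX support.

```c
#include <immintrin.h>

/* Hypothetical wrapper: writes |in[i]| to out[i] for i = 0..7. */
__attribute__((target("avx")))
void abs8_avx(const float *in, float *out)
{
    /* Sign bit set, all other bits clear, in each of the 8 lanes. */
    const __m256 sign_bit = _mm256_set1_ps(-0.0f);
    /* Unaligned load of 8 floats. */
    const __m256 inp = _mm256_loadu_ps(in);
    /* andnot(a, b) computes (~a) & b, so this clears the sign bit of every lane. */
    const __m256 inp_abs = _mm256_andnot_ps(sign_bit, inp);
    _mm256_storeu_ps(out, inp_abs);
}
```

Note the operand order: _mm256_andnot_ps inverts its first argument, so the mask goes first and the data second.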