Given this code: (https://godbolt.org/z/qGneEne7x)
#include <stdio.h>
#include <xmmintrin.h>

int main(void)
{
    double d[2] = { 161785254.0, -215713672.0 };
    int i[4];
    unsigned u[4];

    __m128d vd = _mm_loadu_pd(d);
    __m128i vi = _mm_cvtpd_epi32(vd);
    _mm_storeu_si128((void *)i, vi);
    _mm_storeu_si128((void *)u, vi);

    printf("Doubles to convert : %.1f %.1f\n", d[0], d[1]);
    printf("\n");
    printf("SIMD double to signed int : %i %i\n", i[0], i[1]);
    printf("SIMD double stored to unsigned: %u %u\n", u[0], u[1]);
    printf("\n");
    printf("Scalar double to signed int : %i %i\n", (int)d[0], (int)d[1]);
    printf("Scalar double to unsigned : %u %u\n", (unsigned)d[0], (unsigned)d[1]);
}
Why does optimization change the result of the double to unsigned conversion in scalar code?
This is what it prints out with -O0:
Doubles to convert : 161785254.0 -215713672.0
SIMD double to signed int : 161785254 -215713672
SIMD double stored to unsigned: 161785254 4079253624
Scalar double to signed int : 161785254 -215713672
Scalar double to unsigned : 161785254 4079253624
Everything is as expected.
Turn on -O1 and scalar to unsigned becomes zero:
Scalar double to unsigned : 161785254 0
Why is that?
Converting a double to an unsigned type when the truncated value cannot be represented in that type is undefined behavior in C (C11 6.3.1.4p1), so once the optimizer can see the out-of-range conversion, it is free to produce anything, including 0.

There is a reason the Standard leaves this undefined rather than mandating the wraparound you saw at -O0. On some platforms, processing (unsigned)d in a manner that would yield the same result as (unsigned)(int)d in all cases where the latter was defined may be more expensive than processing it in a manner that might yield different results for some such conversions. Further, for some tasks, having such computations trap when d is outside the range of unsigned may be more useful than having them yield a meaningless value. Rather than try to anticipate all of the situations where alternative behaviors might be more useful than (unsigned)(int)d, the Standard simply waives jurisdiction over any situations other than those where a simple numeric conversion would yield a result within range of unsigned int.