I am working on an application that is converting Float samples in the range of -1.0 to 1.0 to signed 16bit, to ensure the output of the optimized (SSE) routines are accurate I have written a set of tests that runs the non optimized version against the SSE version and compares their output.
Before I start I have confirmed that the SSE rounding mode is set to nearest.
In my test case the formula is:
ratio = 65536 / 2
output = round(input * ratio)
For the most part the results are accurate, but on one particular input I am seeing a failure for an input of -0.8499908447265625
.
-0.8499908447265625 * (65536 / 2) = -27852.5
The normal code correctly rounds this to -27853
, but the SSE code rounds this to -27852
.
Here is the SSE code in use:
void Float_S16(const float *in, int16_t *out, const unsigned int samples)
{
static float ratio = 65536.0f / 2.0f;
static __m128 mul = _mm_set_ps1(ratio);
for(unsigned int i = 0; i < samples; i += 4, in += 4, out += 4)
{
__m128 xin;
__m128i con;
xin = _mm_load_ps(in);
xin = _mm_mul_ps(xin, mul);
con = _mm_cvtps_epi32(xin);
out[0] = _mm_extract_epi16(con, 0);
out[1] = _mm_extract_epi16(con, 2);
out[2] = _mm_extract_epi16(con, 4);
out[3] = _mm_extract_epi16(con, 6);
}
}
Self Contained Example as requested:
/* standard math */
float ratio = 65536.0f / 2.0f;
float in [4] = {-1.0, -0.8499908447265625, 0.0, 1.0};
int16_t out[4];
for(int i = 0; i < 4; ++i)
out[i] = round(in[i] * ratio);
/* sse math */
static __m128 mul = _mm_set_ps1(ratio);
__m128 xin;
__m128i con;
xin = _mm_load_ps(in);
xin = _mm_mul_ps(xin, mul);
con = _mm_cvtps_epi32(xin);
int16_t outSSE[4];
outSSE[0] = _mm_extract_epi16(con, 0);
outSSE[1] = _mm_extract_epi16(con, 2);
outSSE[2] = _mm_extract_epi16(con, 4);
outSSE[3] = _mm_extract_epi16(con, 6);
printf("Standard = %d, SSE = %d\n", out[1], outSSE[1]);
Although the SSE rounding mode defaults to "round to nearest", it's not the old familiar rounding method that we all learned in school, but a slightly more modern variation known as Banker's rounding (aka unbiased rounding, convergent rounding, statistician's rounding, Dutch rounding, Gaussian rounding or odd–even rounding), which rounds to the nearest even integer value. This rounding method is supposedly better than the more traditional method, from a statistical perspective. You will see the same behaviour with functions such as rint(), and it is also the default rounding mode for IEEE-754.
Note also that whereas the standard library function round() uses the traditional rounding method, the SSE instruction ROUNDPS
(_mm_round_ps
) uses banker's rounding.