I encountered a problem when converting binary values to float, prompting me to research the issue further. I found that values between 0xff800001 and 0xffb00000, when reinterpreted as float, have bit 22 flipped (counting from the LSB; I originally reported this as bit 15). The test program I used for this:
unsigned long long ca = 0;
unsigned long long cb = 0;
for(unsigned long long tmpLongLong = 0x00000000; tmpLongLong <= 0xffffffff; tmpLongLong++)
{
    unsigned long tmpLong1 = tmpLongLong;
    float tmpFloat = *(reinterpret_cast<float*>(&tmpLong1));
    unsigned long tmpLong2 = *(reinterpret_cast<unsigned long*>(&tmpFloat));
    ca++;
    if(tmpLong1 != tmpLong2)
    {
        cb++;
    }
    cout << (tmpLong2 == tmpLong1 ? "YES " : "NO ") << std::hex << tmpLong1 << " vs. " << tmpLong2 << std::dec << endl;
}
cout << "bad: " << cb << "/" << ca << " " << 100.0 / ca * cb << "%" << endl;
Example output for two values that are corrupted:
NO ff8003cf vs. ffc003cf
NO ff8003d0 vs. ffc003d0
What is the cause for this issue and how do I overcome it?
This is due to your C++ implementation silencing signaling NaNs.
Note that the difference is in bit 22, not bit 15 as stated in the question. For example, in the case where the before and after values are 0xff8003cf and 0xffc003cf, log2(0xffc003cf − 0xff8003cf) = log2(0x400000) = 22.
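The bit position can also be checked mechanically by XOR-ing the before and after patterns. The following is a minimal sketch; the helper name differing_bit is made up for illustration and is not part of the question's program.

```cpp
#include <cstdint>

// Hypothetical helper: returns the index (0 = LSB) of the single bit in
// which a and b differ, or -1 if they differ in zero bits or in more
// than one bit.
inline int differing_bit(std::uint32_t a, std::uint32_t b) {
    std::uint32_t diff = a ^ b;                  // 1s mark the differing bits
    if (diff == 0 || (diff & (diff - 1)) != 0) { // zero or not a power of two
        return -1;
    }
    int index = 0;
    while ((diff >>= 1) != 0) {                  // count shifts down to bit 0
        ++index;
    }
    return index;
}
```

Called with the two values from the question's output, differing_bit(0xff8003cf, 0xffc003cf) yields 22, since the XOR of the two patterns is 0x00400000.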
When your C++ implementation assigns a float value to a float object, and the value is a signaling NaN, it sets bit 22 to make it a quiet NaN.
In the IEEE-754 interchange format for binary32 (commonly used for float), the bits represent a NaN if the exponent bits (30 to 23) are all on and the significand bits (22 to 0) are not all zero. If the first bit of the significand (22) is set, it is a quiet NaN (one that does not signal an exception when used). If it is clear, it is a signaling NaN (one that signals an exception when used). (“Signal” is used here in the IEEE-754 sense of indicating that an exceptional condition has occurred in the operation, not in the C++ sense of a signal that changes the flow of program control, although that is a potential result of a floating-point signal.)
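The encoding rules above can be applied to a raw 32-bit pattern using integer operations only, so no float object is involved and nothing can be silenced. This is a sketch under the assumption that the platform follows the quiet-bit convention described above; the names NanKind and classify_binary32 are invented for this example.

```cpp
#include <cstdint>

// Hypothetical classification of a binary32 bit pattern.
enum class NanKind { NotNan, Quiet, Signaling };

inline NanKind classify_binary32(std::uint32_t bits) {
    const std::uint32_t exponent    = (bits >> 23) & 0xffu; // bits 30..23
    const std::uint32_t significand = bits & 0x7fffffu;     // bits 22..0
    // A NaN needs an all-ones exponent and a nonzero significand;
    // an all-ones exponent with a zero significand is an infinity.
    if (exponent != 0xffu || significand == 0) {
        return NanKind::NotNan;
    }
    // Bit 22 (the first significand bit) distinguishes quiet from signaling.
    return (significand & 0x400000u) != 0 ? NanKind::Quiet
                                          : NanKind::Signaling;
}
```

With this, 0xff8003cf classifies as a signaling NaN and 0xffc003cf as a quiet NaN, which matches the before and after values in the question's output.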
Commonly, assigning a float value to a float object, as occurs in float tmpFloat = *(reinterpret_cast<float*>(&tmpLong1));, is treated as a copy operation and does not alter the value or signal exceptions. Your C++ implementation seems to be treating it as a signaling operation, so that assigning a signaling NaN value results in signaling an exception (which may be ignored or may raise a flag in the floating-point exception flags) and producing a quiet NaN as a result. The signaling NaN is converted to a quiet NaN by setting bit 22.
If this is what your C++ implementation is doing, there might not be any way to overcome it while assigning float values. You can get the desired bits into a float by copying the bytes that represent it (see below), and you can also get them out by copying. But likely any use of the float value as a float will result in silencing signaling NaNs.
Note that reinterpreting the bits of an object via a reinterpret cast of its pointer is an abuse of C++. A common result is that, when using the resulting pointer, the bits are reinterpreted in the new type. However, this is not guaranteed by the C++ standard, and a proper method is to copy the bits into a new object, as with float tmpFloat; std::memcpy(&tmpFloat, &tmpLong1, sizeof tmpFloat);. C++20 provides std::bit_cast<To>(const From &from), declared in the <bit> header, which reinterprets the bits of from in the type To.
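As a sketch of the byte-copying approach, the helpers below move a bit pattern into a float object and back with std::memcpy only, never assigning the float by value. The helper names are invented for this example, and std::uint32_t is used in place of unsigned long on the assumption that float is 32 bits wide (unsigned long may be wider on some platforms).

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Hypothetical helper: store the bit pattern into an existing float object.
inline void bits_to_float(std::uint32_t bits, float &out) {
    std::memcpy(&out, &bits, sizeof out);
}

// Hypothetical helper: read the bit pattern of a float object.
inline std::uint32_t float_to_bits(const float &value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}

// Round trip through a float object using byte copies only.
inline std::uint32_t round_trip(std::uint32_t bits) {
    float f;
    bits_to_float(bits, f);
    return float_to_bits(f);
}
```

Because only the object representation is copied, round_trip(0xff8003cf) returns 0xff8003cf. Whether the pattern survives once the float is passed around or used as a float value still depends on the implementation, as described above.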