I have the following piece of code, called main.cpp, that converts an IEEE 754 32-bit hex value to a float and then converts it into an unsigned short.
#include <iostream>
using namespace std;

int main() {
    unsigned int input_val = 0xc5dac022;
    float f;
    *((int*) &f) = input_val;
    unsigned short val = (unsigned short) f;
    cout << "Val = 0x" << std::hex << val << endl;
}
I build and run the code using the following command:
g++ main.cpp -o main
./main
When I run this code on my normal PC, I get the correct answer, which is 0xe4a8. But when I run the same code on an ARM processor, it gives an output of 0x0.
Is this happening because I am building the code with the normal gcc instead of an aarch64 toolchain? The code gives the correct output for some other test cases on the ARM processor, but gives an incorrect output for this particular test value. How can I solve this issue?
First, your "type pun" via pointers violates the strict aliasing rule, as mentioned in the comments. You can fix that by switching to memcpy.
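For example, a minimal sketch of that change (keeping the rest of your program unchanged):

#include <cstring>
// ...
unsigned int input_val = 0xc5dac022;
float f;
std::memcpy(&f, &input_val, sizeof f);  // copies the bit pattern into f without violating strict aliasing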
Next, the bit pattern 0xc5dac022 interpreted as an IEEE-754 single-precision float corresponds to a value of about -7000, if my test is right. This is truncated to -7000, which, being negative, cannot be represented in an unsigned short. As such, attempting to convert it to unsigned short has undefined behavior, per [7.3.10 p1] in the C++ standard (C++20 N4860). Note this is different from the situation of converting a signed or unsigned integer to unsigned short, which has well-defined "wrapping" behavior.
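To illustrate the difference, here is a small sketch using your value (it assumes the truncated value fits in an int):

float f = -7000.0f;
int i = (int) f;                        // -7000 fits in int, so this truncation is well defined
unsigned short u = (unsigned short) i;  // integer conversion wraps modulo 65536: 0xe4a8
// unsigned short bad = (unsigned short) f;  // undefined: -7000 is not representable in unsigned short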
So there is no "correct answer" here. Printing 0 is a perfectly legal result, and is also logical in some sense, as 0 is the closest unsigned short value to -7000. But it's also not surprising that the result would vary between platforms / compilers / optimization options, as this is common for UB.
There is actually a difference between ARM64 and x86-64 that explains why you see this particular behavior. When compiling without optimization, gcc on both platforms emits instructions to actually convert the float value to unsigned short at runtime.
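If you want to check this yourself, you can ask gcc for the generated assembly, for example:

g++ -O0 -S main.cpp -o main.s

(-S stops after compilation and writes the assembly to main.s, and -O0 keeps the conversion from being folded away at compile time.)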
ARM64 has a dedicated instruction, fcvtzu, that converts a float to a 32-bit unsigned int, so gcc emits that instruction and then extracts the low 16 bits of the integer result. The behavior of fcvtzu with a negative input is to output 0, and so that's the value you get.
x86-64 doesn't have such an instruction. The nearest thing is cvttss2si, which converts a single-precision float to a signed 32-bit integer. So gcc emits that instruction, then uses the low 16 bits of the result as the unsigned short value. This gives the right answer whenever the input float is in the range [0, 65536), because all those values fit in the range of a 32-bit signed integer. GCC doesn't care what it does in all other cases, because they are UB according to the C++ standard. But it so happens that, since your value of -7000 does fit in a signed int, cvttss2si returns the signed integer -7000, which is 0xffffe4a8. Extracting the low 16 bits gives you the 0xe4a8 that you observed.
When optimizing, gcc on both platforms folds the value into a constant 0 at compile time, which is also perfectly legal.
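Putting the two fixes together, here is one well-defined way to write the program, as a sketch. It assumes the truncated float value always fits in an int, and that the wrapped (x86-style) result is the one you actually want:

#include <cstring>
#include <iostream>
using namespace std;

int main() {
    unsigned int input_val = 0xc5dac022;
    float f;
    memcpy(&f, &input_val, sizeof f);         // reinterpret the bits without violating strict aliasing
    int i = (int) f;                          // truncate toward zero: -7000 (defined as long as the value fits in int)
    unsigned short val = (unsigned short) i;  // integer conversion wraps modulo 65536: 0xe4a8
    cout << "Val = 0x" << std::hex << val << endl;
}

This prints 0xe4a8 on both platforms, regardless of optimization level, because every conversion step is well defined.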