I have an IEEE-754 16-bit float that I'd like to losslessly pack as a 16-bit unsigned integer. The easiest way, of course, is to just copy its bytes into the integer and copy them back later, but the snag is that I need to compare the 16-bit integers afterwards (i.e. greater than, less than, etc.) in my program. So I'm looking for an isomorphism between f16 and u16 that preserves order. Could anyone suggest an algorithm that does this? Thanks!
To maintain <, ==, > of a float16 with integer math, treat the data as if it were a signed integer encoded using sign-magnitude.
Do this with float and (u)int32_t first to get the code right (as float16_t is not widely available), and then adjust for 16-bit.
For negative values, negate the bit pattern (two's complement), so that larger magnitudes map to smaller integers; for positive values, set the MSBit.
Make certain +0.0 and -0.0 convert to the same value.
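#include <float.h>
#include <limits.h>
#include <math.h>  // INFINITY
#include <stdint.h>
#include <stdio.h>
// Assumes same endian for FP and integers
unsigned float_to_sequence(float f) {
  union {
    float f;
    int32_t i;
    uint32_t u;
  } x = {.f = f};
  if (x.i < 0) {
    // Negative (sign bit set): negate so larger magnitudes sort lower
    x.u = -x.u;
  } else {
    // Positive: set the MSBit so it sorts above every negative value
    x.u |= 0x80000000;
  }
  return x.u;
}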
Test
void test(float f) {
  printf("%+-20a %+-18.9e ", f, f);
  printf("0x%08X\n", float_to_sequence(f));
}
int main(void) {
  float f[] = {-INFINITY, -FLT_MAX, -1.0, -FLT_TRUE_MIN, -0.0,  //
               0.0, FLT_TRUE_MIN, 1.0, FLT_MAX, INFINITY};
  size_t n = sizeof f / sizeof f[0];
  for (size_t i = 0; i < n; i++) {
    test(f[i]);
  }
}
Output
-inf                 -inf               0x00800000
-0x1.fffffep+127     -3.402823466e+38   0x00800001
-0x1p+0              -1.000000000e+00   0x40800000
-0x1p-149            -1.401298464e-45   0x7FFFFFFF
-0x0p+0              -0.000000000e+00   0x80000000
+0x0p+0              +0.000000000e+00   0x80000000
+0x1p-149            +1.401298464e-45   0x80000001
+0x1p+0              +1.000000000e+00   0xBF800000
+0x1.fffffep+127     +3.402823466e+38   0xFF7FFFFF
+inf                 +inf               0xFF800000
The conversion is one-to-one, except that +0.0 and -0.0 both convert to the same value, as they should since they compare equal.
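One way to spot-check the ordering claim is to compare every pair of sample values both as floats and as converted integers. The following is only a sketch: `float_key` is a memcpy-based stand-in for float_to_sequence (same arithmetic, a different way of reading the bits), and `bits_to_float` and `check_all_pairs` are hypothetical helper names, not part of the code above. It assumes float is IEEE-754 binary32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Assumes float is IEEE-754 binary32 with the same endianness as uint32_t.
static uint32_t float_key(float f) {
  uint32_t u;
  memcpy(&u, &f, sizeof u);  // read the bits without a union
  return (u & 0x80000000u) ? (uint32_t)(0u - u) : (u | 0x80000000u);
}

// Hypothetical helper: reinterpret a 32-bit pattern as a float.
static float bits_to_float(uint32_t u) {
  float f;
  memcpy(&f, &u, sizeof f);
  return f;
}

// Hypothetical helper: asserts that < and == on the floats agree with
// < and == on the keys for every pair of the ten sample values.
static void check_all_pairs(void) {
  static const uint32_t bits[] = {
      0xFF800000u, 0xFF7FFFFFu, 0xBF800000u, 0x80000001u, 0x80000000u,
      0x00000000u, 0x00000001u, 0x3F800000u, 0x7F7FFFFFu, 0x7F800000u};
  size_t n = sizeof bits / sizeof bits[0];
  for (size_t i = 0; i < n; i++) {
    for (size_t j = 0; j < n; j++) {
      float a = bits_to_float(bits[i]), b = bits_to_float(bits[j]);
      uint32_t ka = float_key(a), kb = float_key(b);
      assert((a < b) == (ka < kb));
      assert((a == b) == (ka == kb));
    }
  }
}
```

Calling `check_all_pairs()` from main should pass silently. The sample list deliberately leaves out NaNs, which are unordered as floats and so have no order to preserve.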
For a 16-bit one-liner, where x is the uint16_t bit pattern of the float16: uint16_t y = (x & 0x8000) ? -x : (x | 0x8000);
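Spelled out as functions, the 16-bit version might look like the sketch below. The names `f16_bits_to_key`, `key_to_f16_bits`, and `check_f16_keys` are hypothetical, and the input is assumed to be the raw IEEE-754 binary16 bit pattern already held in a uint16_t. The inverse is included because the question asks for a lossless packing; it recovers every bit pattern except -0.0, which comes back as +0.0.

```c
#include <assert.h>
#include <stdint.h>

// Map raw binary16 bits to a key whose unsigned order matches the
// floating-point order. +0.0 and -0.0 share the key 0x8000.
static uint16_t f16_bits_to_key(uint16_t x) {
  if (x & 0x8000u) {
    return (uint16_t)(0u - x);  // negative: two's-complement negate
  }
  return (uint16_t)(x | 0x8000u);  // positive: set the top bit
}

// Inverse mapping. Key 0x8000 is returned as +0.0 (0x0000).
static uint16_t key_to_f16_bits(uint16_t k) {
  if (k & 0x8000u) {
    return (uint16_t)(k & 0x7FFFu);  // positive: clear the top bit
  }
  return (uint16_t)(0u - k);  // negative: negate back
}

// Hypothetical self-check: ordering of a few landmarks, plus a round trip
// over every bit pattern except -0.0.
static void check_f16_keys(void) {
  assert(f16_bits_to_key(0xFC00) < f16_bits_to_key(0xBC00));  // -inf < -1.0
  assert(f16_bits_to_key(0xBC00) < f16_bits_to_key(0x8001));  // -1.0 < -tiny
  assert(f16_bits_to_key(0x8001) < f16_bits_to_key(0x0000));  // -tiny < 0.0
  assert(f16_bits_to_key(0x0000) < f16_bits_to_key(0x3C00));  // 0.0 < 1.0
  assert(f16_bits_to_key(0x3C00) < f16_bits_to_key(0x7C00));  // 1.0 < +inf
  for (uint32_t x = 0; x <= 0xFFFFu; x++) {
    if (x == 0x8000u) continue;  // -0.0 collapses onto +0.0
    assert(key_to_f16_bits(f16_bits_to_key((uint16_t)x)) == x);
  }
}
```

NaN bit patterns are not treated specially here: positive-sign NaNs land above +inf and negative-sign NaNs below -inf, so filter them out beforehand if that placement matters.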