floating-point, fixed-point

How to convert packed integer (16.16) fixed-point to float?


How to convert a "32-bit signed fixed-point number (16.16)" to a float?

Is (fixed >> 16) + (fixed & 0xffff) / 65536.0 ok? What about -2.5? And -0.5?

Or is fixed / 65536.0 the right way?

(PS: What does signed fixed-point "-0.5" look like in memory anyway?)


Solution

  • I assume two's complement 32-bit integers and operators that behave as in C#.

    How to do the conversion?

    fixed / 65536.0
    

    is correct and easy to understand.
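A minimal sketch of this conversion, written in Python for illustration. The names SCALE, to_signed32 and fixed_to_float are hypothetical, and to_signed32 is only needed under the assumption that the value arrives as an unsigned 32-bit pattern rather than as a signed integer:

```python
SCALE = 65536.0  # 2**16: the weight of the integer part in 16.16


def to_signed32(bits):
    """Reinterpret an unsigned 32-bit pattern as a two's complement integer."""
    return bits - (1 << 32) if bits & 0x80000000 else bits


def fixed_to_float(raw):
    """Convert a signed 16.16 raw value to a float with a single division."""
    return raw / SCALE


print(fixed_to_float(163840))                   # 2.5
print(fixed_to_float(-163840))                  # -2.5
print(fixed_to_float(to_signed32(0xFFFF8000)))  # -0.5
```

The sign is handled by the integer itself, so no special case for negative values is needed.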


    (fixed >> 16) + (fixed & 0xffff) / 65536.0
    

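    This is equivalent to the above, but slower and harder to read. You're basically using the distributive law to split a single division into two, and writing the first one as a bit shift.

    For negative integers, fixed & 0xffff is not the fractional part of the magnitude; it is the distance above the floor that fixed >> 16 produces. With an arithmetic shift (as in C# for int), the two parts still sum to the right value. With a logical shift, or an integer part that truncates toward zero (such as integer division by 65536), the result is wrong for negative numbers.

    Look at the raw integer -1, which should map to -1/65536. With an arithmetic shift the code returns -1 + 65535/65536 = -1/65536, which is correct. With truncation toward zero it returns 0 + 65535/65536 = 65535/65536 instead.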
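How the split form behaves on negative inputs hinges on how the integer part is rounded, and that is easy to check. The sketch below uses Python, on the assumption that its >> on int (which floors) is a fair stand-in for an arithmetic shift; split_floor and split_trunc are hypothetical names:

```python
def split_floor(raw):
    # Integer part via >>, which floors (arithmetic shift behaviour).
    return (raw >> 16) + (raw & 0xFFFF) / 65536.0


def split_trunc(raw):
    # Integer part truncated toward zero instead of floored.
    return int(raw / 65536.0) + (raw & 0xFFFF) / 65536.0


# Raw values for -1/65536, -0.5 and -2.5.
for raw in (-1, -32768, -163840):
    print(raw, raw / 65536.0, split_floor(raw), split_trunc(raw))
```

For these inputs split_floor agrees with raw / 65536.0, while split_trunc is off by one whole unit: 65535/65536 for raw -1 and -1.5 for raw -163840.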


    Depending on your compiler it might be faster to do:

    fixed * (1/65536.0)
    

    But I assume most modern compilers already do that optimization.
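The rewrite is safe because 1/65536.0 is exactly 2**-16, so in double precision the product and the quotient are the same value. A small sketch that checks this on a few raw values (INV_SCALE and samples are hypothetical names):

```python
INV_SCALE = 1 / 65536.0  # exactly 2**-16, representable without rounding

samples = [-2**31, -163840, -32768, -1, 0, 1, 32768, 2**31 - 1]

# Scaling by a power of two only changes the exponent, so both forms agree.
same = all(raw * INV_SCALE == raw / 65536.0 for raw in samples)
print(same)  # True
```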

    What does signed fixed-point "-0.5" look like in memory anyway?

    Inverting the conversion gives us:

    RoundToInt(float*65536)
    

    Setting float=-0.5 gives us -32768, which is the bit pattern 0xFFFF8000 in 32-bit two's complement.
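To inspect the stored form directly, here is a sketch of the inverse conversion. float_to_fixed is a hypothetical name, Python's round() stands in for RoundToInt (its tie-breaking rule is an assumption), and the byte dump assumes a little-endian layout:

```python
import struct


def float_to_fixed(value):
    # round() stands in for RoundToInt; how ties are broken is an assumption.
    return round(value * 65536)


raw = float_to_fixed(-0.5)
print(raw)                           # -32768
print(hex(raw & 0xFFFFFFFF))         # 0xffff8000
print(struct.pack("<i", raw).hex())  # 0080ffff (little-endian byte order)
```

The upper 16 bits are 0xFFFF (that is, -1) and the lower 16 bits are 0x8000 (that is, +0.5), which together make -0.5.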