c fixed-point

Understanding Fixed-point fractional part


So I'm trying to wrap my head around fixed-point numbers. So far so good. The only thing that confuses me is the 'fractional' part of the number.

My understanding is that a fixed-point number splits the binary number according to the scale (which in this case is eight bits). The left side is the integer part, and the fraction is to the right: XXX.Y, where XXX are the three bytes for the integer/whole part and Y is the one byte for the fractional part (please correct me if I'm wrong).

Let's take the following macros:

#define FIX_SCALE 8
#define FIX_FRACTION_MASK ((1 << FIX_SCALE) - 1)
#define FIX_WHOLE_MASK (~FIX_FRACTION_MASK)
#define FIX_FROM_FLOAT(X) ((X) * (1 << FIX_SCALE))
#define FIX_TO_FLOAT(X) ((float)(X) / (1 << FIX_SCALE))
#define FIX_TO_INT(X) ((X) >> FIX_SCALE)
#define FIX_FROM_INT(X) ((X) << FIX_SCALE)
#define FIX_FRACTION(X) ((X) & FIX_FRACTION_MASK)
#define FIX_WHOLE(X) ((X) & FIX_WHOLE_MASK)

Consider the following example:

int Fixed = FIX_FROM_FLOAT(2.5f);

The resulting integer value is 640, 0x280 in hex and 0000 0000 0000 0000 0000 0010 1000 0000 in binary.

Let's take the lowest two bytes: 0000 0010 1000 0000

I understand where the 0000 0010 comes from: it's the integer part (2). What I don't get is the 1000 0000, which is the fractional part. I just don't see how that relates to the number 5 (which is 0101 in binary). I would have expected something like 0101 0000 or 0000 0101. Clearly I'm misunderstanding a fundamental concept here.

If I write:

int Fraction = FIX_FRACTION(Fixed);

I would get 128 (0x80 in hex), which makes sense because it masks out the integer part (2). The first time I wrote that, I expected to get a 5 back.

I do get 0.5 if I write:

float Fraction = FIX_TO_FLOAT(FIX_FRACTION(Fixed));

Could somebody clear up this confusion for me? Why does the fraction byte 1000 0000 not have any 101 in it? Why do we have to apply FIX_TO_FLOAT to the result of FIX_FRACTION to get the right fraction?

Thanks.


Solution

  • Comparing number patterns in decimal and binary representation does not work. Let's forget about fixed-point numbers for a moment and have a look at the binary representations of 5 and 50:

     5: 0000'0101
    50: 0011'0010
    

    As you can see, the binary pattern of a decimal 5 cannot be found in the binary representation of decimal 50 either.

    Now to understand why decimal 0.5 is ..00'1000'0000 in Q23.8 binary, you need to follow the binary to decimal conversion rule:

    Replace every 1 with 2^position and add up the numbers

    position:      7 6 5 4  3 2 1 0  -1-2-3-4 -5-6-7-8
    binary number: 0 0 0 0  0 0 1 0 . 1 0 0 0  0 0 0 0
    

    2^1 + 2^-1 = 2 + 0.5 = 2.5