
Fixed point <-> Double precision


I need insight on signed conversion.

An input to a C routine is guaranteed to lie in the range -Pi to Pi. It is passed as a double.

Since my device lacks an FPU, I must perform the math (divide the input by Pi) in fixed point on a 24-bit register.

Since the range is known, I am able to store the input data in Q3.10 format.

So:

 typedef int Q3_10;
 typedef int Q1_10, Q4_20, Q4_23, Q1_23; /* assumed: the other Q types are plain int too */
 Q1_10 ReciProc_Pi = 0x145; /* 1/Pi in Q1.10 (325/1024) */
 Q3_10 FP_DataQ310 = (Q3_10)(Input_Data * 1024); /* Scale by 2^10 */
 Q4_20 FP_DataQ420 = FP_DataQ310 * ReciProc_Pi;  /* Input / Pi */
 Q4_23 FP_DataQ423 = FP_DataQ420 << 3; /* Change Q4.20 to Q4.23 */
 Q1_23 FP_DataQ123 = FP_DataQ423 & 0xFFFFFF; /* Retain 24 bits */
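
The scaling steps above follow one pattern: multiply by 2^n and truncate to an integer. A minimal sketch of that pattern, assuming a hypothetical helper named to_fixed_trunc that is not part of the original code:

```c
#include <stdio.h>

/* Hypothetical helper (illustrative name): convert a real value to a
   fixed-point integer with frac_bits fractional bits. The cast truncates
   toward zero, like the (Q3_10) cast in the question. */
static int to_fixed_trunc(double value, int frac_bits)
{
    return (int)(value * (double)(1L << frac_bits));
}
```

With this helper, to_fixed_trunc(1.0 / 3.14159265358979, 10) gives 325, which is 0x145. Rounding to nearest would give 326 instead, so the constant in the question appears to be truncated rather than rounded.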

I can verify the result by printing the fraction with printf.

 double Fraction = (double)FP_DataQ123 / (65536 * 128); /* Divide by 2^23 */
 printf("Fraction = %f\n", Fraction);

So for an input such as 2.5678, the fraction is correctly identified as 0.81483.

Are negative numbers to be treated the same?

The computation above fails when I pass -1.757: printf reports a positive fraction, 1.442410. Yet the hexadecimal value of FP_DataQ123 seems all right (0xB8A0E8), and the sign bit is correctly set to 1.
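
One way to see where 1.442410 comes from is to compare the masked and unmasked conversions. The sketch below uses two hypothetical helpers (fraction_masked and fraction_unmasked are illustrative names) and assumes int32_t in place of the int typedefs. Masking clears the bits above bit 23, so the wider integer holds a positive number that is 2^24 too large, which becomes 2.0 after dividing by 2^23.

```c
#include <stdint.h>

#define Q23_ONE 8388608.0 /* 2^23, the same value as 65536 * 128 */

/* Hypothetical helper mirroring the question: mask to 24 bits, then convert.
   The upper bits are cleared, so the result is never negative. */
static double fraction_masked(int32_t q4_23)
{
    int32_t low24 = q4_23 & 0xFFFFFF;
    return (double)low24 / Q23_ONE;
}

/* Hypothetical helper: convert the value with its sign extension intact. */
static double fraction_unmasked(int32_t q4_23)
{
    return (double)q4_23 / Q23_ONE;
}
```

For -1.757 the Q4.23 value is -4677400. The masked version yields about 1.44241 and the unmasked one about -0.55759; the two differ by exactly 2.0.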

I later did this:

 Q1_23 FP_DataQ123 = FP_DataQ423 & 0xFFFFFF; /* Retain 24 bits */
 if (FP_DataQ123 >> 23) /* Bit 23 is the sign bit */
 {
   printf("Input is negative\n");
   FP_DataQ123 = (~FP_DataQ123) & 0x7FFFFF; /* One's complement of the low 23 bits */
   double Fraction = (double)FP_DataQ123 / (65536 * 128);
   printf("Fraction = %f\n", Fraction);
 }

printf now reports the correct magnitude, 0.557, but without the minus sign.

How do I get printf to print -0.557 without complementing FP_DataQ123?


Solution

  • It turns out that the bits above the 24-bit value had to stay sign extended.

    Q1_23 FP_DataQ123 = FP_DataQ423 & 0xFFFFFF; /* Retain 24 bits */
    

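    Although this is correct as a 24-bit pattern, because bit 23 (the sign bit) is 1 and marks a negative Q1.23 fraction, the mask clears every bit above bit 23. The variable is an int wider than 24 bits, so the conversion to double sees a positive number. Everything above bit 23 needed to remain sign extended.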

    Q1_23 FP_DataQ123 = FP_DataQ423 ; 
    

    does the trick. The sign extension is needed because the cast to double converts the whole int, not just its low 24 bits.
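
    If the mask has to stay, for example because the value really comes from a 24-bit register, the sign can be restored afterwards. A minimal sketch, assuming a hypothetical helper named sign_extend24 and the fixed-width types from <stdint.h> rather than the question's typedefs:

```c
#include <stdint.h>

/* Hypothetical helper (illustrative name): sign-extend the low 24 bits
   of x into a 32-bit signed integer. */
static int32_t sign_extend24(uint32_t x)
{
    uint32_t low = x & 0xFFFFFFu;
    /* XOR flips the sign bit (bit 23); subtracting 2^23 then maps
       patterns with bit 23 set to negative values. */
    return (int32_t)(low ^ 0x800000u) - (int32_t)0x800000;
}

/* Convert a sign-extended Q1.23 value to double. */
static double q1_23_to_double(int32_t q)
{
    return (double)q / 8388608.0; /* 2^23 */
}
```

    Here sign_extend24(0xB8A0E8) returns -4677400, and q1_23_to_double of that value is about -0.55759, so printf shows the minus sign without any complementing.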