c, decimal, fixed-point

Trouble in implementing fixed-point numbers in C


I am trying to make a small fixed-point math library. My fixed point numbers are 32-bit, with 16 bits each for the integral and fractional parts. The trouble comes with adding fixed-point numbers and then seeing the resulting value. The function fixed_from_parts below takes an integral and fractional part, and emits a fixed-point number, so fixed_from_parts(5, 2) would equal 0000000000000101.0000000000000010.

When adding two numbers, as seen in the main function below, it seems that the integral parts are added as one number, and the fractional part is added as another (5.2 + 3.9 incorrectly becomes 8.11, because 5 + 3 == 8 and 2 + 9 == 11). I think that I need to reverse the order of the bits stored in the fractional part, but I'm not quite sure how to do that. Am I overcomplicating this? How do I make addition work correctly?

#include <stdint.h>
#include <stdio.h>

typedef int16_t integral_t;
typedef int32_t fixed_t;

fixed_t int_to_fixed(const integral_t x) {
    return x << 16;
} 

integral_t fixed_to_int(const fixed_t x) {
    return x >> 16;
}

// shifts left (clears integral bits), and then shifts back right
integral_t get_fixed_fractional(const fixed_t x) {
    return (integral_t) x << 16 >> 16;
}

// fixed_from_parts(5, 2) == 5.2
fixed_t fixed_from_parts(const integral_t integral, const integral_t fractional) {
    return int_to_fixed(integral) + fractional;
}

void print_fixed_base_2(const fixed_t x) {
    for (int i = (sizeof(fixed_t) << 3) - 1; i >= 0; i--) {
        putchar((x & (1u << i)) ? '1' : '0'); // 1u avoids signed overflow when i == 31
        if (i == sizeof(fixed_t) << 2) putchar('.');
    }
    putchar('\n');
}

void print_fixed_base_10(const fixed_t x) {
    printf("%d.%d\n", fixed_to_int(x), get_fixed_fractional(x));
}

int main(void) {
    // 5.2 + 3.9 = 9.1
    const fixed_t a = fixed_from_parts(5, 2), b = fixed_from_parts(3, 9);

    print_fixed_base_2(a);
    print_fixed_base_2(b);

    const fixed_t result = a + b;

    print_fixed_base_2(result);
    print_fixed_base_10(result); // why is the result 8.11?
}

Solution

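  • What you have is not really fixed point. fixed_from_parts(5, 2) stores the decimal digit 2 in the low 16 bits, which represents 2/65536, not 0.2. The fractional field has to hold the fraction scaled by 2^16 (0.2 becomes roughly 13107), so that when two fractions sum to 1 or more, the carry goes into the integral bits.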

    Example:

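    #include <stdint.h>
    #include <stdio.h>

    #define MULT    (1 << 16)   // scale factor: 16 fractional bits

    // Parenthesize the arguments so expressions like MAKE_FIXED(a + b) expand correctly.
    #define MAKE_FIXED(d)  ((int32_t)((d) * MULT))
    #define MAKE_REAL(f)   (((double)(f)) / MULT)

    // The raw 64-bit product carries 32 fractional bits; dividing by MULT restores 16.
    int32_t mulf(int32_t a, int32_t b)
    {
        int64_t part = (int64_t)a * b;
        return (int32_t)(part / MULT);
    }

    // Pre-scale the dividend so the quotient keeps 16 fractional bits.
    int32_t divf(int32_t a, int32_t b)
    {
        int64_t part = ((int64_t)a * MULT) / b;
        return (int32_t)part;
    }

    int main(void)
    {
        int32_t num1 = MAKE_FIXED(5.2);
        int32_t num2 = MAKE_FIXED(3.9);

        // Addition and subtraction work directly on the raw values.
        printf("%f\n", MAKE_REAL(num1 + num2));
        int32_t result = mulf(num1, num2);
        printf("%f\n", MAKE_REAL(result));
        result = divf(num1, num2);
        printf("%f\n", MAKE_REAL(result));
    }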