c gcc esp32

Where and how is accuracy lost when doing integer multiplication and division in C and storing the result in an integer or a float?


For a calculation in a C program running on an ESP32, I have to multiply and divide the following integers in this order:

150 × 10000 ÷ 155 ÷ 138 × 100 ÷ 220 × 100 which produces 3100.000000 for a float variable and 3100 for a 32-bit unsigned integer.

I tried to test the result of the calculation on https://www.onlinegdb.com/ using the following code :

#include <stdio.h>
#include <stdint.h>

int main () {
    float calc = 150 * 10000 / 155 / 138 * 100 / 220 * 100 ;
    printf ( "calc = %f\n", calc ) ;    // 3100.000000

    uint32_t calc0 = 150 * 10000 / 155 / 138 * 100 / 220 * 100 ;
    printf ( "calc0 = %u\n", calc0 ) ;  // 3100
}

which again produces 3100.000000 for the float and 3100 for the 32-bit unsigned integer.

If I enter the same numbers into the calculator on my phone or laptop, the result is 3187.555782226 in both cases.

So the accuracy loss on the ESP32 is (if I haven't messed up the formula) about (3187 − 3100) ÷ 3187 × 100 ≈ 2.73 %.

Where does the difference come from, and is it possible to get the same result on a 32-bit microcontroller as on the PC?


Solution

  • You're not losing any precision on your calculator or mobile device. The precise result is 3187.5557822261889583067701... and your mobile device is approximating that quite accurately.

    The problem is that in the expression 150 * 10000 / 155 / 138 * 100 / 220 * 100, all multiplications and divisions are between integers, and every integer division discards the fractional part of its result. Even if you use this expression to initialize a float, it's too late by then; the precision is already lost.
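
    To see where the digits go, the integer expression can be traced one operation at a time. This is a minimal sketch; div_trace is a hypothetical helper introduced only for illustration, and int is assumed to be at least 32 bits wide (as it is on the ESP32):

    ```c
    #include <stdio.h>

    // Hypothetical helper: integer division that reports the discarded remainder.
    static int div_trace(int value, int divisor) {
        int quotient = value / divisor;   // integer division truncates toward zero
        printf("%d / %d = %d (remainder %d is discarded)\n",
               value, divisor, quotient, value % divisor);
        return quotient;
    }

    int main(void) {
        int v = 150 * 10000;     // 1500000, exact
        v = div_trace(v, 155);   // 9677 (the exact quotient is about 9677.42)
        v = div_trace(v, 138);   // 70   (about 70.12 before truncation)
        v *= 100;                // 7000
        v = div_trace(v, 220);   // 31   (about 31.82 before truncation)
        v *= 100;                // 3100
        printf("result = %d\n", v);
    }
    ```

    Most of the error comes from the last division: the 0.82 that is dropped there is multiplied by 100 in the final step.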

    To get a more precise result, make the first operand a float by writing it as 150.f (a floating-point constant with the f suffix). Since the operators are evaluated left to right, every subsequent operation is then performed in floating point:

    #include <stdio.h>
    #include <stdint.h>
    #include <inttypes.h>
    
    int main(void) {
        float calc = 150.f * 10000 / 155 / 138 * 100 / 220 * 100;
        printf( "calc = %f\n", calc );
    
        uint32_t calc0 = 150.f * 10000 / 155 / 138 * 100 / 220 * 100;
        // note: PRIu32 expands to the correct format specifier for uint32_t
        printf( "calc0 = %" PRIu32 "\n", calc0 );
    
        // tip: we can use unsigned long long (ull suffix) and shift all of the
        //      multiplications to the start to minimize precision loss
        //      (unsigned long long is needed to prevent overflow)
        uint32_t calc1 = 150ull * 10000 * 100 * 100 / 155 / 138 / 220; // 1)
        printf( "calc1 = %" PRIu32 "\n", calc1 );
    }
    

    Prints:

    calc = 3187.555420
    calc0 = 3187
    calc1 = 3187
    


    1) Instead of / 155 / 138 / 220, we can also write / (155ull * 138 * 220). For unsigned integers the result is guaranteed to be the same, provided the product of the divisors does not overflow.
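
    The float result above (3187.555420) still differs from the calculator after the third decimal, because a float only carries about 7 significant decimal digits. If more digits are needed, two options are sketched below: evaluate in double, or stay with integers and scale the numerator so that some decimals survive the single division. The helper names ratio_double and ratio_milli are hypothetical, and the scale factor of 1000 is just an example. On a target without double-precision hardware, the double version is typically emulated in software and therefore slower.

    ```c
    #include <stdio.h>
    #include <stdint.h>
    #include <inttypes.h>

    // Hypothetical helper: the same expression evaluated in double precision.
    static double ratio_double(void) {
        return 150.0 * 10000 / 155 / 138 * 100 / 220 * 100;
    }

    // Hypothetical helper: integer-only result in thousandths.
    // The numerator is scaled by 1000 before the single division,
    // so three decimal places survive the truncation.
    static uint64_t ratio_milli(void) {
        uint64_t num = 150ull * 10000 * 100 * 100 * 1000; // 15,000,000,000,000
        uint64_t den = 155ull * 138 * 220;                // 4,705,800
        return num / den;
    }

    int main(void) {
        printf("double = %f\n", ratio_double());          // 3187.555782

        uint64_t milli = ratio_milli();                   // 3187555
        printf("fixed  = %" PRIu64 ".%03" PRIu64 "\n",
               milli / 1000, milli % 1000);               // 3187.555
    }
    ```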