Tags: c, casting, floating-point, ieee-754, 32-bit

Difference in casting float to int, 32-bit C


I am currently working with old code that needs to run on a 32-bit system. During this work I stumbled across an issue that (out of academic interest) I would like to understand the cause of.

It seems that casting from float to int in 32-bit C behaves differently if the cast is done on a variable or on an expression. Consider the program:

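#include <stdio.h>
int main(void) {
   int i,c1,c2;
   float f1,f10;
   for (i=0; i< 21; i++)  {
      f1 = 3+i*0.1;            /* 3.0, 3.1, ..., 5.0 stored as float */
      f10 = f1*10.0;           /* product assigned to a float variable */
      c1 = (int)f10;           /* cast of the variable */
      c2 = (int)(f1*10.0);     /* cast of the expression */
      printf("%d, %d, %d, %11.9f, %11.9f\n",c1,c2,c1-c2,f10,f1*10.0);
   }
   return 0;
}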

When compiled with gcc, either directly on a 32-bit system or on a 64-bit system using the -m32 option, the program prints:

30, 30, 0, 30.000000000 30.000000000
31, 30, 1, 31.000000000 30.999999046
32, 32, 0, 32.000000000 32.000000477
33, 32, 1, 33.000000000 32.999999523
34, 34, 0, 34.000000000 34.000000954
35, 35, 0, 35.000000000 35.000000000
36, 35, 1, 36.000000000 35.999999046
37, 37, 0, 37.000000000 37.000000477
38, 37, 1, 38.000000000 37.999999523
39, 39, 0, 39.000000000 39.000000954
40, 40, 0, 40.000000000 40.000000000
41, 40, 1, 41.000000000 40.999999046
42, 41, 1, 42.000000000 41.999998093
43, 43, 0, 43.000000000 43.000001907
44, 44, 0, 44.000000000 44.000000954
45, 45, 0, 45.000000000 45.000000000
46, 45, 1, 46.000000000 45.999999046
47, 46, 1, 47.000000000 46.999998093
48, 48, 0, 48.000000000 48.000001907
49, 49, 0, 49.000000000 49.000000954
50, 50, 0, 50.000000000 50.000000000 

Hence, there is clearly a difference between casting a variable and casting an expression. Note that the issue also exists if float is changed to double and/or int is changed to short or long. The issue does not manifest if the program is compiled as 64-bit.

To clarify, the issue that I'm trying to understand here is not about floating-point arithmetic/rounding, but rather differences in memory handling in 32-bit.

The issue was tested on:

  • Linux version 4.15.0-45-generic (buildd@lgw01-amd64-031) (gcc version 7.3.0 (Ubuntu 7.3.0-16ubuntu3)), program compiled using: gcc -m32 Cast32int.c

  • Linux version 2.4.20-8 ([email protected]) (gcc version 3.2.2 20030222 (Red Hat Linux 3.2.2-5)), program compiled using: gcc Cast32int.c

Any pointers to help me understand what is going on here are appreciated.


Solution

  • With MS Visual C 2008 I was able to reproduce this.

    The assembler output shows that the difference between the two is an intermediate store and fetch of the result, with conversions along the way:

      f10 = f1*10.0;          // double result f10 converted to float and stored
      c1 = (int)f10;          // float result f10 fetched and converted to double
      c2 = (int)(f1*10.0);    // no store/fetch/convert
    

    The generated assembler pushes the values onto the FPU stack, where they are converted to 64 bits and then multiplied. For c1 the result is converted back to float and stored, then fetched again and placed on the FPU stack (and converted to double again) for a call to __ftol2_sse, a run-time function that converts a double to an int.

    For c2 the intermediate value is not converted to and from float; it is passed directly to the __ftol2_sse function. For this function, see also the answer to "Convert double to int?".

    Assembler:

          f10 = f1*10;
    fld         dword ptr [f1] 
    fmul        qword ptr [__real@4024000000000000 (496190h)] 
    fstp        dword ptr [f10] 
    
          c2 = (int)(f1*10);
    fld         dword ptr [f1] 
    fmul        qword ptr [__real@4024000000000000 (496190h)] 
    call        __ftol2_sse
    mov         dword ptr [c2],eax 
    
          c1 = (int)f10;
    fld         dword ptr [f10] 
    call        __ftol2_sse
    mov         dword ptr [c1],eax