Tags: casting, floating-point, integer, compiler-construction

How does casting from integers to floating-point numbers work?


Assuming 32-bit values (int32_t, float), they are stored in memory as follows:

// 255
int:   11111111 00000000 00000000 00000000 (big endian)
int:   00000000 00000000 00000000 11111111 (little endian)
float: 0 11111111 000000000000000000000

By this point it's fairly obvious that the memory itself is arranged differently, depending on the interpreted type.

Further assuming a standard C-style cast, how is this achieved? I usually work with x86(_64) and ARMHF CPUs, but I'm not familiar with their respective assembly languages or the way the CPUs are organised internally, so please excuse me if this is answered simply by knowing the internals of these CPUs. Of primary interest is how C/C++ and C# handle this cast.

  • Does the compiler generate instructions which interpret the sign bit and the exponent portion and just convert them over to a memory structure representing an integer, or is there some magic going on in the background?
  • Do x86_64 and ARMHF have built-in instructions to handle this sort of thing?
  • Or: does a C-style cast simply copy the memory, leaving it up to the runtime to interpret whatever value pops out (seems unlikely, but I may be mistaken)?

The suggested posts Why are floating point numbers inaccurate? and Why can't decimal numbers be represented exactly in binary? do help with understanding basic concepts of floating-point math, but do not answer this question.


Solution

If int: 11111111 00000000 00000000 00000000 (big endian) is showing us the bytes in memory order (lowest address to highest address), then that is little endian, not big endian: The least significant bits of 255, 11111111, are in the low address, and the most significant bits, 00000000, are in the high address.
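
One way to see this on a given machine is to examine the bytes of an int through an unsigned char pointer, which C permits. This is a minimal sketch; the output in the comment assumes a little-endian machine:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint32_t i = 255;
    const unsigned char *p = (const unsigned char *) &i;
    for (size_t k = 0; k < sizeof i; k++)
        printf("%02x ", p[k]);   // little endian prints: ff 00 00 00
    printf("\n");
    return 0;
}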

float: 0 11111111 000000000000000000000 is not the encoding of 255 in the format most commonly used for float, IEEE-754 binary32. The bits would be 0 10000110 11111110000000000000000 (437F0000₁₆, or, when stored little-endian, 00 00 7F 43). The exponent code of 134 represents an exponent of 134 − 127 = 7, and the significand field represents 1.1111111₂ = 1.9921875, so the entire value represented is +1.9921875 • 2⁷ = 255.
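
This encoding can be checked by copying the bits of a float into a uint32_t with memcpy and printing them. A sketch, assuming float is IEEE-754 binary32 (which the C standard does not strictly require):

#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void)
{
    float f = 255.0f;
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);        // copy the representation, not the value
    printf("%08x\n", (unsigned) bits);     // prints 437f0000 on IEEE-754 machines
    printf("sign %u exponent %u significand %06x\n",
           (unsigned) (bits >> 31),            // sign: 0
           (unsigned) ((bits >> 23) & 0xff),   // exponent code: 134
           (unsigned) (bits & 0x7fffff));      // significand field: 7f0000
    return 0;
}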

A compiler will generate whatever instructions it needs to work with values. Typically, processors with hardware support for floating-point have different instructions, and often different registers, for integer and floating-point values. To work with an int, the compiler will generate instructions to load it into a general register and integer-arithmetic instructions to operate on it. To work with a float, the compiler will generate instructions to load it into a floating-point register and floating-point-arithmetic instructions to operate on it.
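
For instance, a conversion function like the one below typically compiles, on hardware with floating-point support, to a single instruction that moves the value between an integer register and a floating-point register (on x86_64, something like cvtsi2ss; on 64-bit ARM, scvtf). The exact instruction depends on the target and compiler:

float int_to_float(int x)
{
    return (float) x;   // usually one hardware convert instruction
}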

If the hardware does not have floating-point support, the compiler generates instructions to interpret and process the bits representing the float in ways necessary to produce the correct results. Much of this is done by calling routines from a library of software-floating-point routines. Inside those routines, the instructions break down the parts of a floating-point representation, do computations as necessary, and reassemble the parts to produce floating-point results.
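
As a rough illustration of what such a routine does, here is a simplified sketch that builds the binary32 bits for a uint32_t using only integer operations. It truncates low bits instead of rounding to nearest as a real library routine would, and it is not any particular library's code:

#include <stdint.h>

uint32_t u32_to_f32_bits(uint32_t v)
{
    if (v == 0)
        return 0;                              // +0.0f

    // Normalize: find the highest set bit; its position is the exponent.
    int exp = 31;
    while (!(v & 0x80000000u)) { v <<= 1; exp--; }

    // Drop the leading 1 (implicit in the format) and keep the next
    // 23 bits as the significand field, truncating the rest.
    uint32_t frac = (v & 0x7fffffffu) >> 8;

    // Assemble sign (0), biased exponent, and significand field.
    return (uint32_t) (exp + 127) << 23 | frac;
}

For v = 255, the highest set bit is bit 7, so the exponent is 7, and the result is 437F0000₁₆, matching the encoding above.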

x86_64 has built-in floating-point instructions.

ARMHF has built-in floating-point instructions; the HF stands for Hardware Floating-point or Hard Float. (I do not have information that that is an official ARM designation; it may be colloquial.)

When you cast an int to float or vice-versa in C, the compiler uses a built-in instruction to perform the conversion (unless optimization provides another solution), if the hardware has such an instruction. The hardware instruction manipulates the bits of the representation to compute the result. If the hardware does not have an instruction for this, the compiler generates whatever instructions it needs, likely calling a routine from a library as above.
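
Note that the cast converts the value, while copying the memory (the possibility raised in the question) reinterprets the bits as a different type and generally produces a very different number. A sketch contrasting the two; it assumes int and float are both 32 bits, and the reinterpreted output assumes IEEE-754 binary32:

#include <stdio.h>
#include <string.h>

int main(void)
{
    int i = 255;

    float converted = (float) i;   // value conversion: 255 -> 255.0f

    float reinterpreted;
    memcpy(&reinterpreted, &i, sizeof reinterpreted);  // same bits, new type

    printf("%f\n", converted);       // 255.000000
    printf("%g\n", reinterpreted);   // a tiny subnormal, about 3.57e-43
    return 0;
}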

C implementations that support mixed big-endian and little-endian types are rare. However, if supported, the compiler would simply swap bytes as needed. Some hardware assists with this via instructions that either swap bytes as words are loaded and stored or swap bytes in registers.
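
A byte swap in portable C is just shifts and masks, and compilers commonly recognize the pattern and emit a single instruction for it (BSWAP on x86, REV on ARM). A minimal sketch:

#include <stdint.h>

uint32_t swap32(uint32_t x)
{
    return  x << 24
         | (x & 0x0000ff00u) <<  8
         | (x & 0x00ff0000u) >>  8
         |  x >> 24;
}

For example, swap32(0x000000FF) yields 0xFF000000, turning the little-endian representation of 255 into the big-endian one.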