
Negative decimal number / string to IEEE Single Precision Format


I am taking an Assembly Programming class and having a terrible time understanding "IEEE Single Precision Format." Our text has only two sentences about it, and the few websites I have found that discuss it were not very helpful. In full disclosure, this is one of the questions on our weekly assignment and I just can't figure it out. I sort of understand positive numbers, but as soon as the assignment had a negative number, everything went out the window.

"Use IEEE single precision format to encode the decimal number -1.25 in single-precision floating point. Show all your work for credit."

Any help on this would be greatly appreciated.


Solution

  • Your message suggests negative numbers are giving you trouble. Simply ignore that the number is negative, and convert 1.25 instead. Then flipping the top bit will give you the result for -1.25. Floats are nice this way: to negate, all you need to do is flip the top (sign) bit.
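To see the sign-bit trick in action, here is a minimal Python sketch. It uses the standard `struct` module to view a float as a 32-bit integer; the helper names `float_to_bits` and `bits_to_float` are made up for this illustration.

```python
import struct

def float_to_bits(x: float) -> int:
    """Return the 32-bit single-precision pattern of x as an integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def bits_to_float(bits: int) -> float:
    """Interpret a 32-bit integer as a single-precision float."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

positive_bits = float_to_bits(1.25)
negative_bits = positive_bits ^ 0x80000000  # flip only the top (sign) bit

print(f"{positive_bits:08X}")        # 3FA00000
print(f"{negative_bits:08X}")        # BFA00000
print(bits_to_float(negative_bits))  # -1.25
```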

    Details: Remember that a single-precision number has 32 bits: 1 bit for the sign, 8 bits for the exponent, and 23 bits for the significand. For "normal" numbers, there's an implicit 1, i.e., the value is:

     (-1)^s * 1.significand * 2^exponent
    

    (The 1 in 1.significand is the "implicit" 1.)
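As a rough illustration of that formula, the sketch below evaluates it directly for a normal number. The function name `decode_normal` is hypothetical, and `exponent` here means the true (unbiased) exponent, as in the formula above.

```python
def decode_normal(sign: int, exponent: int, fraction: int) -> float:
    """Evaluate (-1)^s * 1.significand * 2^exponent for a normal number.

    sign     : 0 or 1
    exponent : the true (unbiased) exponent
    fraction : the 23 stored significand bits, as an integer
    """
    significand = 1 + fraction / 2**23  # prepend the implicit 1
    return (-1) ** sign * significand * 2.0 ** exponent

# Significand bits 010...0 are the integer 0b01 shifted left by 21 places.
print(decode_normal(1, 0, 0b01 << 21))  # -1.25
```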

    You need to figure out what goes in the s, exponent and significand bits for your number, such that the above formula equals the number you want to represent. (More precisely: "closest" to it. You may have to round, because 32 bits give only finitely many patterns and so not every real number is exactly representable.)

    Sign bit is easy: We know it'll be 1 since your number is negative.

    Then there are 8 bits of exponent. You find it by looking for the power of 2 that is "just below" (or equal to) the magnitude of your number. In this case 2^0 = 1 < 1.25 < 2 = 2^1, so the exponent is 0. IEEE stores exponents with a bias of 127, meaning instead of E, it'll store E+127. (There are good reasons for doing this, easily googleable.) So, the stored exponent will be 0+127 = 127, or 01111111.

    Then comes the 23-bit significand. For normal numbers, there's an implicit 1, meaning you simply need to represent the remaining 0.25 in 23 bits. Well, that's easy: after the binary point the bit weights are negative powers of two, starting at 2^-1, so 010...0 will do, since 0*2^-1 + 1*2^-2 = 1/4 = 0.25.
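The exponent and significand steps above can be sketched in Python as follows. This is a simplified illustration that assumes a positive, normal input that is exactly representable (as 1.25 is); it does not handle rounding carries, zero, subnormals, infinities or NaN. The variable names are my own.

```python
import math

value = 1.25  # magnitude only; the sign is handled separately

# math.frexp returns (m, e) with value == m * 2**e and 0.5 <= m < 1,
# so the power of two "just below" value is e - 1.
_, e = math.frexp(value)
exponent = e - 1                  # true exponent: 0
stored_exponent = exponent + 127  # biased exponent: 127

# Drop the implicit leading 1 and scale what is left to 23 bits.
fraction = value / 2**exponent - 1       # 0.25
fraction_bits = round(fraction * 2**23)  # exact here, no rounding needed

print(f"{stored_exponent:08b}")  # 01111111
print(f"{fraction_bits:023b}")   # 01000000000000000000000
```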

    Putting it all together, you end up with: 10111111101000000000000000000000. Printed prettily, this is:

      ENCODED = -1.25 :: Float
                      3  2          1         0
                      1 09876543 21098765432109876543210
                      S ---E8--- ----------S23----------
       Binary layout: 1 01111111 01000000000000000000000
          Hex layout: BFA0 0000
           Precision: Single
                Sign: Negative
            Exponent: 0 (Stored: 127, Bias: 127)
      Classification: FP_NORMAL
              Binary: -0b1.01
               Octal: -0o1.2
             Decimal: -1.25
                 Hex: -0x1.4
       Rounding mode: RNE: Round nearest ties to even.
                Note: Conversion from "-1.25" was exact. No rounding happened.
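If you want to check your hand-computed answer, one option is to assemble the three fields into a 32-bit word and compare it with what Python's standard `struct` module produces for -1.25. This is only a verification sketch; the variable names are illustrative.

```python
import struct

sign = 1                    # negative
stored_exponent = 127       # 0 + bias of 127
fraction_bits = 0b01 << 21  # 010...0 (23 bits)

# Layout: sign in bit 31, exponent in bits 30..23, significand in bits 22..0.
word = (sign << 31) | (stored_exponent << 23) | fraction_bits

print(f"{word:032b}")  # 10111111101000000000000000000000
print(f"{word:08X}")   # BFA00000

# Cross-check against the machine's own single-precision encoding.
assert struct.pack(">f", -1.25) == word.to_bytes(4, "big")
```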