I am taking an Assembly Programming class and am having a terrible time understanding "IEEE Single Precision Format." Our text only has two sentences about this, and the few websites I have found that discuss it were not overly helpful. In full disclosure, this is one of the questions on our weekly assignment and I just can't figure it out. I sort of understand normal numbers, but as soon as the assignment had a negative number, it just threw everything out the window.
"Use IEEE single precision format to encode the decimal number -1.25 in single-precision floating point. Show all your work for credit."
Any help on this would be greatly appreciated.
Your message suggests negative numbers are giving you trouble. Simply ignore that the number is negative, and convert 1.25 instead. Then flipping the top bit will give you the result for -1.25. Floats are nice this way: to negate, all you need to do is flip the top bit.
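As a quick check of the "flip the top bit" claim, here is a minimal Python sketch. The helper names `float_to_bits` and `bits_to_float` are made up for this illustration; they just wrap the standard `struct` module to move between a float and its 32-bit pattern.

```python
import struct

def float_to_bits(x: float) -> int:
    """Return the 32-bit single-precision pattern of x as an integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def bits_to_float(bits: int) -> float:
    """Interpret a 32-bit integer as a single-precision float."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

positive = float_to_bits(1.25)
negative = positive ^ 0x80000000  # flip bit 31, the sign bit

print(f"{positive:08X}")        # 3FA00000
print(f"{negative:08X}")        # BFA00000
print(bits_to_float(negative))  # -1.25
```

Only the leading hex digit changes (3 becomes B), because the sign is the single topmost bit.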
Details: Remember that a single-precision number has 32 bits: 1 bit for the sign, 8 bits for the exponent, and 23 bits for the significand. For "normal" numbers, there's an implicit leading 1, i.e., the value is:
(-1)^s * 1.significand * 2^exponent
(The 1 in 1.significand is the "implicit" 1.)
You need to figure out what needs to go in the s, exponent, and significand bits for your number, such that the above formula equals the number you want to represent. (More precisely: is "closest" to it, since you'll have to round. 32 bits can encode only finitely many values, so not all numbers are exactly representable.)
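The formula can be evaluated directly to see which field values produce which number. This is only a sketch: `normal_value` is a hypothetical helper, it takes the significand as a string of bits after the binary point, and it uses the true (unbiased) exponent exactly as the formula is written.

```python
def normal_value(s: int, significand_bits: str, exponent: int) -> float:
    """Evaluate (-1)^s * 1.significand * 2^exponent for a normal number.

    significand_bits holds the bits after the binary point, e.g. "01".
    exponent is the true exponent, before any bias is applied.
    """
    # Bit i (counting from 0) after the binary point has weight 2^-(i+1).
    fraction = sum(int(b) * 2.0 ** -(i + 1) for i, b in enumerate(significand_bits))
    return (-1) ** s * (1 + fraction) * 2.0 ** exponent

print(normal_value(1, "01", 0))  # -1.25
print(normal_value(0, "01", 0))  # 1.25
print(normal_value(0, "1", 1))   # 3.0
```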
The sign bit is easy: we know it'll be 1 since your number is negative.
Then there are 8 bits of exponent. This can be found by finding the power of 2 that is "just below" the magnitude of your number. In this case 2^0 = 1 < 1.25 < 2 = 2^1, so the exponent is 0. IEEE stores exponents with a bias of 127, meaning instead of E, it'll store E+127. (There are good reasons for doing this, easily googleable.) So, the stored exponent will be 0+127 = 127, or 01111111 in binary.
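The exponent step can be sketched in Python as follows. `exponent_field` is a name invented for this example, and the code assumes a nonzero, finite input in the normal single-precision range; zeros, subnormals, infinities, and NaNs use special exponent encodings that are not handled here.

```python
import math

def exponent_field(x: float):
    """Return (true exponent, 8-bit stored exponent string) for a normal number."""
    # math.frexp gives (m, e) with abs(x) == m * 2**e and 0.5 <= m < 1,
    # so the power of two "just below" abs(x) is 2**(e - 1).
    _, e = math.frexp(abs(x))
    true_exponent = e - 1
    stored = true_exponent + 127  # apply the bias of 127
    return true_exponent, format(stored, "08b")

print(exponent_field(-1.25))  # (0, '01111111')
print(exponent_field(10.0))   # (3, '10000010')
```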
Then comes the 23-bit significand. For normal numbers, there's an implicit 1, meaning you simply need to represent 0.25 in 23 bits. Well, that's easy: after the binary point, the bit weights are negative powers of two starting at 2^-1, so 010...0 will do, since 0*2^-1 + 1*2^-2 = 1/4 = 0.25.
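If you want to produce those bits mechanically, the usual pencil-and-paper method is repeated doubling. The sketch below uses a hypothetical helper, `fraction_bits`. Note that it truncates after 23 bits rather than rounding, which is fine for values like 0.25 that fit exactly but is not how the hardware rounds in general.

```python
def fraction_bits(fraction: float, width: int = 23) -> str:
    """Binary digits of a value in [0, 1), truncated to `width` bits."""
    bits = []
    for _ in range(width):
        fraction *= 2
        bit = int(fraction)  # 1 if doubling pushed the value to 1 or above
        bits.append(str(bit))
        fraction -= bit      # keep only the part after the binary point
    return "".join(bits)

print(fraction_bits(0.25))  # 01000000000000000000000
```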
Putting it all together, you end up with: 10111111101000000000000000000000. Printed prettily, this is:
ENCODED = -1.25 :: Float
               3  2          1         0
               1 09876543 21098765432109876543210
               S ---E8--- ----------S23----------
Binary layout: 1 01111111 01000000000000000000000
Hex layout: BFA0 0000
Precision: Single
Sign: Negative
Exponent: 0 (Stored: 127, Bias: 127)
Classification: FP_NORMAL
Binary: -0b1.01
Octal: -0o1.2
Decimal: -1.25
Hex: -0x1.4
Rounding mode: RNE: Round nearest ties to even.
Note: Conversion from "-1.25" was exact. No rounding happened.
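To double-check the hand-built encoding, you can assemble the three fields yourself and compare against what the machine produces. This is a sketch using Python's standard `struct` module; the variable names are just for illustration.

```python
import struct

sign = "1"                         # negative
exponent = format(0 + 127, "08b")  # true exponent 0 plus the bias of 127
significand = "01" + "0" * 21      # 0.25, padded out to 23 bits

word = int(sign + exponent + significand, 2)
print(f"{word:08X}")  # BFA00000

# Decode the hand-built pattern back into a float.
print(struct.unpack(">f", word.to_bytes(4, "big"))[0])  # -1.25

# Encode -1.25 directly and compare with the hand-built pattern.
print(struct.pack(">f", -1.25).hex().upper())  # BFA00000
```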