
What is the largest/smallest fixed-point number that can be represented in 2n bits?


More specifically, consider a binary number stored in 2n bits, with n integer bits (including one sign bit) and n fraction bits.

What are the smallest and largest positive, non-zero numbers that can be represented?

I know how to deal with integers, but I am not sure about the fractions.


Solution

  • A typical binary fixed-point representation is an integer scaled by a constant power of two, so the factors involved are:

    1. the range of the integer and
    2. the scale of the fixed-point type.

    Given a two's complement integer with a sign bit and 2n-1 value bits, the range of positive integers is [1..(2^(2n-1))-1] and the scale is 2^-n. So the minimum and maximum positive fixed-point values are [1*2^-n..((2^(2n-1))-1)*2^-n], which simplifies to [2^-n..2^(n-1) - 2^-n].

    For example, C's int16_t type has a sign bit and 15 value bits, so its positive range is [1..(2^15)-1] or [1..32767]. Here, n is 8, making the scale 2^-8 or 1/256. So the scaled range is [1/256..32767/256] or [0.00390625..127.99609375]. The range for different values of n can be calculated with a C++ program using the CNL library.
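
    As a rough illustration of the arithmetic above, here is a minimal C++ sketch that does not use CNL and is not the program the answer refers to. The names FixedRange and positive_range are hypothetical, chosen for this example only. It assumes the layout from the question (n integer bits including the sign, n fraction bits) and limits n to 26 so that every value is exactly representable in a double.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper type: the smallest and largest positive values
// of a 2n-bit fixed-point number with n fraction bits.
struct FixedRange {
    double smallest;
    double largest;
};

// Hypothetical helper: computes the positive range for a given n.
// Assumes 1 <= n <= 26 so the largest integer (2^(2n-1) - 1) fits
// exactly in a double's 53-bit significand.
FixedRange positive_range(int n) {
    assert(n >= 1 && n <= 26);

    // Scale of the fixed-point type: 2^-n.
    const double scale = std::ldexp(1.0, -n);

    // Largest positive two's complement integer with 2n-1 value bits.
    const std::int64_t max_int = (std::int64_t{1} << (2 * n - 1)) - 1;

    // Smallest positive integer is 1, so the smallest value is the scale itself.
    return FixedRange{scale, static_cast<double>(max_int) * scale};
}
```

    Calling positive_range(8) should give 0.00390625 and 127.99609375, matching the int16_t example.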