Search code examples
pythonrandomieee-754proof

Finding 0 - (Python Testing) random.uniform(0,1) Double Precision Floating Point


This has been making my head hurt. I have also seen a detailed post on stack overflow about this situation that is a few years old and I would like to highlight some of the flaws from the discussions. I will highlight them in the next section. But the problem is outlined below.

I am trying to find out if the random.uniform(0,1) function from the random library,

import random

can indeed output a value of 0 given a range greater than 0 excluding special case random.uniform(0,0) (range here is 0).

Special Case

>>>random.uniform(0,0) 
0.0

Test

>>>random.uniform(0,1)
0.11689643963821128

Upon doing some research about IEEE 754 standard for double precision floating point numbers which are used in python for this function, I have not been able to prove that 0 cannot be a result of random.uniform(x,y) where (x!=y and x,y !=0)See expectation.

But this leads me to another question why does IEEE754 standard round to 0 if and only if it is 0, but not any other number unless proving otherwise obviously.

I want to say it's impossible, but wondering what you guys think.

I have used this website to check for a few numbers too but with no luck.

I have tried using various loops some even give me outputs closer to 1 when using this function, I need to do some more testing before backing this up, but using values like that on the website provided above, provide weird values.

The link to the old stackoverflow thread can be found here

One of the responses suggest that

>>> random.uniform(0., 5e-324)
0.0

will indeed give you a value of 0 however it can also give you this (is it a 50% chance?)

>>> random.uniform(0., 5e-324)
5e-324

other times it has shown to me to be infinite in VS Code Debugger when used in a loop until reaching 0.0 for some reason upon a random interval.

So this also begs the question does python really process 324 digits of precision using the uniform specification without needing to specify that we really want a range between 0 and 5e-324. With an integer range it only produces about 16 digits of precision.

There are some other points that I have read through on the older thread, but I am struggling to relate here to it.

So can random.uniform(0,1) actually produce 0.0? If you have tried experimenting or researching this in other languages, please share your thoughts and results.

(I may be making more edits to this question to make implement further testing or clearing and further explaining to some points made)


Solution

  • Python does not have a formal specification, and the behavior of random.uniform is not well specified. Given a and b with a ≤ 0 < b, I expect random.uniform(a, b) should be able to return zero, based on my minimum quality expectations. And, since it supports inverted ranges, it should also be able to return zero with a and b with b < 0 ≤ a.

    Discussion in the documentation suggest that random.uniform(a, b) is computed as a + (b-a) * random.random(). It also says random.random() returns a number in [0, 1). This is an incomplete specification for random.random(), and my comments about that for JavaScript/ECMAScript are here. My suspicion would be that random.random() prepares a 53-bit integer drawn from a uniform distribution2, converts it to the IEEE-754 binary64 (“double precision”) format and divides it by 253.

    If random.uniform(a, b) is computed as a + (b-a) * random.random(), then floating-point rounding can, in general, produce b as a result, so the output range is not purely [a, b). The underlying mathematics, if computed with infinite precision, would never produce b as a result, but, because a limited-precision format is used, the roundings in the multiplication and in the addition can result in b being produced.

    One of the responses suggest that … random.uniform(0., 5e-324) … will indeed give you a value of 0 however it can also give you this (is it a 50% chance?)

    When 5e-324 is converted to IEEE-754 binary64, the result is 2−1074, approximately 4.940656•10−324. There are no representable values between 0 and 2−1074. Computing random.uniform(0, 5e-324) as described above yields 0 + (2−1074−0)•random.random() rounded to the nearest representable value. This equals 2−1074random.random() rounded to the nearest representable value. If random.random() is in [0, ½], this will round to 0.1 If it is in (½, 1), it will round to 2−1074. There are 252+1 representable numbers in [0, ½] and 252−1 representable numbers in (½, 1). So the chance of getting 0 is not 50%; it is slightly greater, (252+1)/253 = ½+2−53.

    So this also begs the question does python really process 324 digits of precision…

    No. Under the assumptions above, a Python implementation takes 53 bits for random.random() and then computes a + (b-a)*random.random(). When you evaluate random.uniform(0., 5e-324), the small result is caused by scaling after the random number is selected for random.random(). It is not present in the selection of the random number.

    Footnotes

    1 Mark Dickinson confirmed this for CPython in a comment: “We draw two 32-bit unsigned ints from the underlying Mersenne Twister generator, then combine the top 27 bits of the first with the top 26 bits of the second (probably out of latent superstition about lack of randomness in low-order bits of PRNGs, which certainly applies to LCGs, but not to Mersenne Twister),” also linking to the source code:

    static PyObject *
    _random_Random_random_impl(RandomObject *self)
    /*[clinic end generated code: output=117ff99ee53d755c input=afb2a59cbbb00349]*/
    {
        uint32_t a=genrand_uint32(self)>>5, b=genrand_uint32(self)>>6;
        return PyFloat_FromDouble((a*67108864.0+b)*(1.0/9007199254740992.0));
    }
    

    2 In the border case, 2−1074•½ would yield 2−1075 with real-number arithmetic. This result is exactly halfway between the two nearest representable values, 0 and 2−1074. The default rounding rule says, in case of ties, to round to the neighbor with the even low digit, so 0 is chosen.