python numpy deep-learning

When to use numpy.random.randn(...) and when numpy.random.rand(...)?


In my deep learning exercise I had to initialize a parameter D1 with the same shape as A1, so this is what I did:

D1 = np.random.randn(A1.shape[0],A1.shape[1]) 

But after computing the further equations, the results didn't match when I checked them. After reading the docs properly, I discovered that they said to initialize D1 using rand() instead of randn():

D1 = np.random.rand(A1.shape[0],A1.shape[1]) 

But they didn't specify the reason for it, and the code runs in both cases. There was a doc for that exercise, so I figured out the error, but how, when and why should I choose between these two?


Solution

  • The difference between rand and randn is (besides the letter n) that rand returns random numbers sampled from a uniform distribution over the interval [0,1), while randn instead samples from a normal (a.k.a. Gaussian) distribution with a mean of 0 and a variance of 1.

    In other words, the distribution of the random numbers produced by rand looks like this:

    [Figure: uniform distribution]

    In a uniform distribution, all the random values are restricted to a specific interval, and are evenly distributed over that interval. If you generate, say, 10000 random numbers with rand, you'll find that about 1000 of them will be between 0 and 0.1, around 1000 will be between 0.1 and 0.2, around 1000 will be between 0.2 and 0.3, and so on. And all of them will be between 0 and 1 — you won't ever get any outside that range.
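
    The bucket counts described above are easy to check. The following is a minimal sketch, not part of the original exercise; the seed value and the variable names are arbitrary choices made for illustration.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, used only to make the run repeatable

# Draw 10000 values from the uniform distribution over [0, 1).
samples = np.random.rand(10000)

# Count how many samples land in each of ten equal-width bins over [0, 1].
counts, edges = np.histogram(samples, bins=10, range=(0.0, 1.0))

print(counts)                        # each entry should be close to 1000
print(samples.min(), samples.max())  # both values lie inside [0, 1)
```

    The exact counts vary from run to run (or with the seed), but each bin should stay close to 1000 and no sample should fall outside [0, 1).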

    Meanwhile, the distribution for randn looks like this:

    [Figure: normal distribution]

    The first obvious difference between the uniform and the normal distributions is that the normal distribution has no upper or lower limits — if you generate enough random numbers with randn, you'll eventually get one that's as big or as small as you like (well, subject to the limitations of the floating point format used to store the numbers, anyway). But most of the numbers you'll get will still be fairly close to zero, because the normal distribution is not flat: the output of randn is a lot more likely to fall between, say, 0 and 0.1 than between 0.9 and 1, whereas for rand both of these are equally likely. In fact, as the picture shows, about 68% of all randn outputs fall between -1 and +1, while 95% fall between -2 and +2, and about 99.7% fall between -3 and +3.
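
    The percentages quoted above can be checked empirically as well. This is a rough sketch under the same assumptions as any sampling experiment: the seed and the sample size are arbitrary, and the measured fractions will only approximate the theoretical values.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, used only to make the run repeatable

# Draw 100000 values from the standard normal distribution (mean 0, variance 1).
samples = np.random.randn(100000)

# Fraction of samples within 1, 2 and 3 standard deviations of the mean.
fractions = {k: np.mean(np.abs(samples) < k) for k in (1, 2, 3)}
for k, frac in fractions.items():
    print(f"within +/-{k}: {frac:.3f}")  # roughly 0.68, 0.95 and 0.997

# Compare two intervals of equal width: one near zero, one near one.
near_zero = np.mean((samples >= 0.0) & (samples < 0.1))
near_one = np.mean((samples >= 0.9) & (samples < 1.0))
print(near_zero, near_one)  # the interval near zero is hit more often
```

    With rand the two equal-width intervals would be hit about equally often; with randn the one closer to zero wins.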

    These are completely different probability distributions. If you switch one for the other, things are almost certainly going to break: even if the code doesn't simply crash, you will most likely get incorrect or nonsensical results.
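
    As an illustration of code that runs in both cases but gives different results, here is a hedged sketch. The question does not say how D1 is used afterwards, so this assumes, purely for the sake of example, that D1 is turned into a keep/drop mask by comparing it against a threshold. The name keep_prob, its value, and the stand-in array A1 are all hypothetical.

```python
import numpy as np

np.random.seed(1)  # arbitrary seed, used only to make the run repeatable

A1 = np.random.randn(50, 2000)  # stand-in array; only its shape matters here
keep_prob = 0.5                 # hypothetical threshold, not from the question

# Assumed usage: build a boolean mask by thresholding the random values.
D1_uniform = np.random.rand(*A1.shape) < keep_prob
D1_normal = np.random.randn(*A1.shape) < keep_prob

print(D1_uniform.mean())  # close to keep_prob, because rand is uniform on [0, 1)
print(D1_normal.mean())   # noticeably higher: P(Z < 0.5) is about 0.69 for a standard normal
```

    Under this assumption neither version raises an error, but only the rand version keeps the intended fraction of entries, which is the kind of silent mismatch described in the question.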