Search code examples
python-3.xnumpypytorchprobability-distribution

How the standard normal distribution works in practice in NumPy and PyTorch?


I have two points to ask about:

1) I would like to understand what is precisely returned from the np.random.randn from NumPy and torch.randn from PyTorch. They both return a tensor with random numbers from a normal distribution with mean 0 and std 1, hence, a standard normal distribution. However, it is not the same thing as puting x values in the standard normal distribution function here and getting its respective image values y. The values returned by PyTorch and NumPy does not seem like this.

For me, it seems that both np.random.randn and torch.randn from these libraries returns the x values from the functions, not the image y as I calculated below. Is that correct?

normal = np.array([(1/np.sqrt(2*np.pi))*np.exp(-(1/2)*(i**2)) for i in range(-38,39)])

Printing the normal variable shows me something like this.

array([1.10e-314, 2.12e-298, 1.51e-282, 3.94e-267, 3.79e-252, 1.34e-237,
       1.75e-223, 8.36e-210, 1.47e-196, 9.55e-184, 2.28e-171, 2.00e-159,
       6.45e-148, 7.65e-137, 3.34e-126, 5.37e-116, 3.17e-106, 6.90e-097,
       5.52e-088, 1.62e-079, 1.76e-071, 7.00e-064, 1.03e-056, 5.53e-050,
       1.10e-043, 8.00e-038, 2.15e-032, 2.12e-027, 7.69e-023, 1.03e-018,
       5.05e-015, 9.13e-012, 6.08e-009, 1.49e-006, 1.34e-004, 4.43e-003,
       5.40e-002, 2.42e-001, 3.99e-001, 2.42e-001, 5.40e-002, 4.43e-003,
       1.34e-004, 1.49e-006, 6.08e-009, 9.13e-012, 5.05e-015, 1.03e-018,
       7.69e-023, 2.12e-027, 2.15e-032, 8.00e-038, 1.10e-043, 5.53e-050,
       1.03e-056, 7.00e-064, 1.76e-071, 1.62e-079, 5.52e-088, 6.90e-097,
       3.17e-106, 5.37e-116, 3.34e-126, 7.65e-137, 6.45e-148, 2.00e-159,
       2.28e-171, 9.55e-184, 1.47e-196, 8.36e-210, 1.75e-223, 1.34e-237,
       3.79e-252, 3.94e-267, 1.51e-282, 2.12e-298, 1.10e-314])

2) Also, if we ask these libraries that I want a matrix of values from a standard normal distribution, it means that all rows and columns are draw from the same standard distribution? If I want i.i.d distributions in every row, I would need to call np.random.randn over a for loop for each row and then vstack them?


Solution

  • 1) Yes, they give you x and not phi(x) since the formula for phi(x) gives the probability density of sampling a value x. If you want to know the probability of getting values in an interval [a,b] you need to integrate phi(x) between a and b. Intuitively, if you look at the function phi(x) you'll see that you're more likely to get values near zero than, say, values near 1. An easy way to see it, is look at the histogram of the sampled values.

    import numpy as np
    import matplotlib.pyplot as plt    
    samples = np.random.normal(size=[1000])
    plt.hist(samples)
    

    enter image description here

    2) they're iid. Just use a 2d size like so:

    samples = np.random.normal(size=[10, 10])