python numpy statistics robotics normal-distribution

Should samples from np.random.normal sum to zero?


I am working on the motion model of a robot. In every time step, the robot's motion is measured, then I sample from a normal distribution with the measurement as the mean and a small sigma as the standard deviation in order to simulate noise. This noisy motion is then added to the robot's previous state estimate.

But when I keep the robot still, these noisy measurements seem to accumulate and the robot "thinks it's moving."

Shouldn't these random samples cancel out and sum to zero rather than accumulate?

In other words, would you expect the following to be true:

0 ~ np.sum([np.random.normal(0, 0.1) for _ in range(1000)])

I have tried writing out the above in an explicit loop and seeding the random number generator with a different number before taking every sample, but the sums still deviate far from zero.

Is this simply a limitation of random number generators, or am I misunderstanding the fact(?) that many samples from the normal distribution should sum to zero?


Solution

  • The short answer to your question is no. Be careful not to conflate the sum of independent random variables with their mean: the mean of many zero-mean samples converges to zero, but their sum does not.

    Per the article that @Hongyu Wang referenced in his comment, let's verify the following:

    "If X and Y are independent random variables that are normally distributed, then their sum is also normally distributed."

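    Effectively, this is what you have done: you created an array of independent random variables and took their sum, which is itself normally distributed. Its mean is 0, but the variances add, so the sum of 1000 samples with sigma 0.1 has a standard deviation of 0.1 * sqrt(1000) ≈ 3.16. A sum that lands a few units away from zero is therefore expected, not a flaw in the random number generator.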

    I have slightly modified your code to demonstrate:

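    import numpy as np
    import seaborn as sns
    import matplotlib.pyplot as plt
    
    # Repeat the experiment 1000 times: each entry is the sum of 1000 samples.
    x = [np.sum(np.random.normal(0, 0.1, size=1000)) for _ in range(1000)]
    
    # Histogram of the sums (distplot is deprecated in recent seaborn releases).
    sns.histplot(x, kde=True)
    plt.show()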
    

    Which yields: [image: histogram of the 1000 sums]

    You can verify that the samples themselves are centered on a mean of 0 by computing:

    np.mean([np.random.normal(0, 0.1) for _ in range(1000)])
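
To make the difference between the sum and the mean concrete, here is a minimal sketch that measures how widely each one spreads over many repeated trials. The helper name `sum_and_mean_spread`, the trial count, and the fixed seed are my own choices for illustration, not something from the question.

```python
import numpy as np

def sum_and_mean_spread(n_samples=1000, n_trials=2000, sigma=0.1, seed=0):
    """Return the empirical std of the sum and of the mean of n_samples draws.

    Hypothetical helper for illustration; seed and trial count are arbitrary.
    """
    rng = np.random.default_rng(seed)
    # Each row is one experiment of n_samples zero-mean normal draws.
    draws = rng.normal(0.0, sigma, size=(n_trials, n_samples))
    sums = draws.sum(axis=1)    # what the robot accumulates
    means = draws.mean(axis=1)  # what actually converges to zero
    return sums.std(), means.std()

sum_std, mean_std = sum_and_mean_spread()
print(f"std of sums:  {sum_std:.3f} (theory: {0.1 * np.sqrt(1000):.3f})")
print(f"std of means: {mean_std:.5f} (theory: {0.1 / np.sqrt(1000):.5f})")
```

The spread of the sum grows like sigma * sqrt(n), while the spread of the mean shrinks like sigma / sqrt(n). For the robot, this means that adding independent noise to the state at every step produces a random walk, so the estimate is expected to drift even when the robot is standing still.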