python numpy poisson

Creating vector with Poisson increments


Suppose we start with a vector of M = 100 equally spaced points between 0 and 1:

z = np.linspace(0, 1, M)

This vector has equal increments from 0 to 1.

I want to create a new vector where the increments z_{n+1}-z_n are distributed according to the Poisson distribution with parameter lambda. I tried this with a cumulative sum:

lam = 10000
dz = np.random.poisson(lam, M)
z = np.cumsum(dz)

but I am not sure whether this is correct. Would the increments of this new vector z follow the Poisson distribution?

[Image: histogram of z]


Solution

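  • Thanks for updating, I understand the problem now. The increments np.diff(z) are just the Poisson draws in dz, so they are Poisson-distributed by construction. The values of z themselves are a different matter: you shouldn't expect the vector z to follow a Poisson distribution.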

    To demonstrate why, let's draw several independent Poisson samples and add them together.

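    import numpy as np
    import matplotlib.pyplot as plt

    # Five independent samples, each 200 draws from Poisson(lambda=1000)
    a = np.random.poisson(1000, 200)
    b = np.random.poisson(1000, 200)
    c = np.random.poisson(1000, 200)
    d = np.random.poisson(1000, 200)
    e = np.random.poisson(1000, 200)  # drawn but not plotted below

    # Overlay histograms of the element-wise sums of 2, 3 and 4 samples
    plt.figure(figsize=(15, 10))
    plt.hist(a+b, bins=200, label="a+b")
    plt.hist(a+b+c, bins=200, label="a+b+c")
    plt.hist(a+b+c+d, bins=200, label="a+b+c+d")
    plt.legend()
    plt.show()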
    

    This looks like: [Image: overlaid histograms of a+b, a+b+c and a+b+c+d]

    Cool, so the sums still appear to be Poisson-distributed, with each histogram shifted further to the right. Note the linear scaling of the lambda parameter: every sample had lambda = 1000; adding two together gives something that looks like lambda = 2000, and adding three looks like lambda = 3000.

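    So, and this really is just looking at the issue approximately, it appears that adding vectors in this way retains the Poisson behaviour, with increasing lambda values. (This is in fact an exact result: the sum of independent Poisson variables with rates lambda_1 and lambda_2 is Poisson with rate lambda_1 + lambda_2.)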
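The additivity seen in the histograms can also be checked numerically. The following is a minimal sketch, not part of the original answer: the seed, the sample size n, and the names rng and stats are my own choices. It relies on the fact that a Poisson variable has mean and variance both equal to its rate, so matching values are consistent with (though not proof of) a Poisson distribution with rate k * lam.

```python
import numpy as np

rng = np.random.default_rng(0)   # seed is an arbitrary choice
lam, n = 1000, 200_000           # n is an assumption, larger than the 200 above to reduce noise

stats = {}
for k in (2, 3, 4):
    # Element-wise sum of k independent Poisson(lam) samples
    total = sum(rng.poisson(lam, n) for _ in range(k))
    # For Poisson(k * lam), both the mean and the variance should be near k * lam
    stats[k] = (total.mean(), total.var())
    print(f"k={k}: mean={stats[k][0]:.1f}, variance={stats[k][1]:.1f}")
```

Both printed numbers should land close to k * 1000 for each k.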

    Your use case, however, is that each interval must be a value drawn from the Poisson distribution. Let's say our vector starts as [l, 0, 0, ..., 0], where each l stands for a separate value drawn from a Poisson distribution with rate parameter lambda. To get the ith value, we add another draw l to the (i-1)th value, so after one step the vector looks like [l, l+l, 0, ..., 0]. If we repeat this, our vector is:

    z = [l, l+l, l+l+l, ..., l + ... + l (n terms)]
    

    The values in this vector are most certainly NOT draws from a single Poisson distribution. Building z this way is roughly equivalent to pulling one value from each of the histograms I plotted: the ith element behaves like a draw with rate i*lambda. Your graph looks the way it does because the later values in the array are very high, and of course they are: the final value is the sum of all the draws in dz, each averaging close to 10,000! A histogram uses equally spaced bins. If you set the number of bins too low, you'll get one thick block. If you set it too high, you'll get discrete bars each containing a single count, positioned roughly at i*lambda, where i is the index of the array element.
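It may help to separate the values of z from its increments. The sketch below is my own illustration, using M = 100 and lam = 10000 from the question; the seed and the names rng, expected and rel_err are assumptions. It checks two things: that differencing the cumulative sum gives back the original draws, and that the values themselves grow roughly linearly with the index.

```python
import numpy as np

rng = np.random.default_rng(1)   # seed is an arbitrary choice
lam, M = 10_000, 100             # values taken from the question

dz = rng.poisson(lam, M)
z = np.cumsum(dz)

# Differencing the cumulative sum recovers the draws (all but the first one)
increments_match = np.array_equal(np.diff(z), dz[1:])
print("np.diff(z) equals dz[1:]:", increments_match)

# The values are a different story: z[i] sits near (i + 1) * lam
expected = lam * np.arange(1, M + 1)
rel_err = np.abs(z - expected) / expected
print("largest relative deviation from (i + 1) * lam:", rel_err.max())
```

So a histogram of np.diff(z) is a histogram of Poisson draws, while a histogram of z spreads almost evenly between roughly lam and M * lam.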

    One final point to note: you can't demand that the array starts at 0 and ends at 1 if the values in between are pulled from a distribution with an average value of 10,000, unless you apply some normalisation.
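One possible normalisation, sketched here as an assumption about what is wanted rather than as the only option, is to prepend 0 and divide by the final value. The names rng and z_unit and the seed are mine. After rescaling, the increments are no longer integer Poisson counts; they are Poisson draws divided by a common (random) total.

```python
import numpy as np

rng = np.random.default_rng(2)   # seed is an arbitrary choice
lam, M = 10_000, 100             # values taken from the question

dz = rng.poisson(lam, M - 1)               # M - 1 increments give M grid points
z = np.concatenate(([0], np.cumsum(dz)))   # start the grid at 0
z_unit = z / z[-1]                         # rescale so the last point is exactly 1

print(z_unit[0], z_unit[-1], len(z_unit))
```

The resulting grid runs from 0 to 1 with M points, and its spacing is proportional to the Poisson draws in dz.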