So, I tried to simulate mixed Poisson distribution:
data2 <- data.frame(x = c(rpois(n = 50, lambda = 0), rpois(n = 450, lambda = 10)))
Then plot the histogram and density function upon it, for the distribution function I used function dmixpois
from spatstat package. Here is the code for the plot:
ggplot(data2, aes(x = x)) +
geom_histogram(aes(y = ..density..), bins = 15) +
geom_line(aes(x = x, y = dmixpois(data2$x, mu = 9, sd = sqrt(41))))
Clearly, the density function is wrong. As far as I know, the mean for mixed Poisson distribution is linear combination of the means for the singled distributions and the variance is E[lambda] + Var[lambda]. Here on the plot I only used the variance term, but if I add the expected value of lambda, I get the density to be even more steep. What is wrong with the computations?
The 'mixed Poisson' that you are simulating isn't the same as the mixed Poisson model in spatstat
, which just assumes that the lambda of a Poisson distribution itself is a normally-distributed random variable. From the docs:
In effect, the Poisson mean parameter lambda is randomised by setting lambda = invlink(Z) where Z has a Gaussian N(μ,σ²) distribution.
It therefore won't simulate the mixing of two independent Poisson distributions.
It looks like what you are simulating is a zero-inflated Poisson distribution, that is to say, a distribution where there is a probability of getting zero counts or a Poisson-distributed count for any given observation. There is a specific dzipois
for this in the VGAM
package.
Remember also that a Poisson distribution is a discrete probability distribution, so you cannot properly show it with a continuous line, but rather only with points or spikes at the integer values.
If you want to plot a distribution that matches your simulation, you can try the following:
set.seed(1) # To make the example reproducible
data2 <- data.frame(x = c(rpois(n = 50, lambda = 0),
rpois(n = 450, lambda = 10)))
ggplot(data2) +
geom_histogram(aes(x = x, y = ..density..), breaks = 0:21 - 0.5) +
geom_point(data = data.frame(x =0:20, y = VGAM::dzipois(0:20, 10, 1/10)),
aes(x, y))