python, r, statistics

R `dnbinom` giving different results than Python `scipy.stats.nbinom.pmf` even after accounting for different parameterization


I am trying to translate my colleague's R code into Python. It involves a calculation with the negative binomial distribution's probability mass function, but R's dnbinom uses a different parameterization from Python's scipy.stats.nbinom.pmf. According to this question and its answer, given R's mu (mean) and size (dispersion), I should be able to get SciPy's n and p with the following code:

 p = 1 / (1 + size * mu)
 n = 1 / size

However, if I assume convert_params does the above calculation and apply it like this:

 from scipy.stats import nbinom
 n, p = convert_params(15, 0.463965)
 nbinom.pmf(3, n, p) 

I get 0.036, whereas if I do this in R:

 dnbinom(3, mu=15, size=0.463965)

I get 0.05.

Does anyone know what's going on here? Have I used an incorrect formula to change the parameterization?


Solution

  • Note that from R's documentation,

    size -- target for number of successful trials, or dispersion parameter (the shape parameter of the gamma mixing distribution). Must be strictly positive, need not be integer.

    and from SciPy's documentation:

    n is the number of successes. The number of successes n may also be specified in terms of a "dispersion", "heterogeneity", or "aggregation" parameter. ...

    The mean mu is related to the probability of success as p = n/(n + mu)

    Thus you need to equate SciPy's n directly to R's size, which gives p = size/(size + mu) = 1/(1 + mu/size):

    import scipy.stats

    mu, size = 15, 0.463965
    scipy.stats.nbinom(n=size, p=1/(1 + mu/size)).pmf(3)
    # 0.050034315664281
    

    This matches R's result:

    dnbinom(3, mu=15, size=0.463965)
    [1] 0.05003432
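
For completeness, here is a minimal sketch of what a corrected convert_params might look like, based on the relationship quoted above (n = size, p = n/(n + mu)). The question never shows the function body, so the signature convert_params(mu, size) is an assumption inferred from how it is called there.

```python
from scipy.stats import nbinom


def convert_params(mu, size):
    """Map R's (mu, size) parameterization to SciPy's (n, p).

    Assumed signature: mu is the mean, size is R's dispersion parameter.
    """
    n = size                # SciPy's n is the same quantity as R's size
    p = size / (size + mu)  # equivalent to 1 / (1 + mu / size)
    return n, p


n, p = convert_params(15, 0.463965)
value = nbinom.pmf(3, n, p)
print(value)  # about 0.0500, in line with dnbinom(3, mu=15, size=0.463965)
```

A quick sanity check on any such conversion is that nbinom.mean(n, p) should return the original mu.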