I am trying to translate my colleague's R code into Python. It involves a calculation with the negative binomial distribution's probability mass function, but R's dnbinom uses a different parameterization from Python's scipy.stats.nbinom.pmf. According to this question and its answer, given R's mu (mean) and size (dispersion), I should be able to get SciPy's n and p with the following code:
p = 1 / (1 + size * mu)
n = 1 / size
However, if I assume convert_params does the above calculation and apply it like this:
from scipy.stats import nbinom
n, p = convert_params(15, 0.463965)
nbinom.pmf(3, n, p)
I get 0.036, whereas if I do this in R:
dnbinom(3, mu=15, size=0.463965)
I get 0.05.
Does anyone know what's going on here? Have I used an incorrect formula to change the parameterization?
Note that from R's documentation,
size -- target for number of successful trials, or dispersion parameter (the shape parameter of the gamma mixing distribution). Must be strictly positive, need not be integer.
and from SciPy's documentation:
n is the number of successes. The number of successes n may also be specified in terms of a "dispersion", "heterogeneity", or "aggregation" parameter. ...
The mean mu is related to the probability of success as
p = n/(n + mu)
Thus you need to directly equate Python's n to R's size:
import scipy.stats
mu, size = 15, 0.463965
scipy.stats.nbinom(n=size, p=1/(1 + mu/size)).pmf(3)
0.050034315664281
dnbinom(3, mu=15, size=0.463965)
[1] 0.05003432
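The formula in the question keeps the mean at mu but treats size as the reciprocal dispersion (n = 1 / size), which is the inverse of what R's size means. The following is a minimal sketch of a corrected convert_params, assuming it takes (mu, size) in that order as in the question. The helper nb_pmf is hypothetical and not part of SciPy; it evaluates the pmf with the standard library only, so the conversion can be checked independently of either package.

```python
import math

def convert_params(mu, size):
    """Map R's dnbinom(mu=, size=) parameters to SciPy's nbinom (n, p)."""
    n = size                 # R's size is used as SciPy's n unchanged
    p = size / (size + mu)   # algebraically equal to 1 / (1 + mu / size)
    return n, p

def nb_pmf(k, n, p):
    """Negative binomial pmf for real n > 0 (hypothetical helper, log-gamma based)."""
    log_coef = math.lgamma(k + n) - math.lgamma(n) - math.lgamma(k + 1)
    return math.exp(log_coef + n * math.log(p) + k * math.log1p(-p))

n, p = convert_params(15, 0.463965)
# Should agree with R's dnbinom(3, mu=15, size=0.463965)
print(nb_pmf(3, n, p))
```

With SciPy installed, the same n and p can be passed straight to nbinom.pmf(3, n, p).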