After generating negative binomial data with prob set equal to .007, I get that number back from the glm.nb() fit, but only by cheating.
library(MASS)
counts <- data.frame(y = rnbinom(10000, prob = .007, size = 247))
head(counts)
fitted_model<-glm.nb(y ~ 1, data = counts, link="identity")
# Theta is the shape parameter of the negative binomial distribution,
# often called the "dispersion parameter". So this is "r".
r <- theta.ml(fitted_model$y, fitted(fitted_model))[1]
mu<-coef(fitted_model) #This is the mean
# mu=prob*r/(1-prob) according to https://en.wikipedia.org/wiki/Negative_binomial_distribution
# solving mu = prob*r/(1-prob) for prob gives prob = mu/(r + mu) ?
mu/(r + mu) # Wrong! This isn't the prob I used to generate the data!
r/(r + mu) # Right! But why does this get me the correct value of prob?
#This has hints: http://www.wright.edu/~thaddeus.tarpey/ES714glm.pdf
I don't want to cheat to get the value of "prob" out of the fitted model. Can anyone explain why r/(r + mu) = prob?
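For reference, here is a minimal sketch of recovering prob without calling theta.ml() by hand. It assumes only that glm.nb() stores its shape estimate in the fitted object's $theta component (it does, per MASS); the seed is arbitrary, just for reproducibility:

```r
library(MASS)

set.seed(42)  # arbitrary seed, just for reproducibility
counts <- data.frame(y = rnbinom(10000, prob = 0.007, size = 247))

fit <- glm.nb(y ~ 1, data = counts, link = "identity")
r   <- fit$theta          # the shape ("r"); same estimate theta.ml() returns
mu  <- unname(coef(fit))  # the mean, since this is an intercept-only identity-link fit
prob_hat <- r / (r + mu)
prob_hat                  # close to the 0.007 used to simulate
```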
If you compare Wikipedia's definition
C(k+r-1,k) (1-p)^r p^k
with the definition given in ?NegBinomial
Gamma(x+n)/(Gamma(n) x!) p^n (1-p)^x
you'll see that the roles of p and 1-p are switched: Wikipedia is defining p as the probability of the counted event (the "failure", if we define the NB variate as the number of failures occurring before the n-th success), while R is defining p as the probability of "success". In other words, R's prob corresponds to Wikipedia's 1-p. Substituting p = 1 - prob into Wikipedia's mean formula mu = p*r/(1-p) gives mu = (1-prob)*r/prob, and solving for prob yields prob = r/(r + mu). That is why r/(r+mu), rather than mu/(r+mu), recovers the prob you simulated with.
...
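A quick numeric check of the p ↔ 1-p switch, using small made-up values of r and p (nothing taken from the question):

```r
r   <- 5
p_R <- 0.3                 # made-up value; "prob" in R's sense
x   <- 0:10

d_R <- dnbinom(x, size = r, prob = p_R)   # R's pmf, per ?NegBinomial

p_wiki <- 1 - p_R                         # the switch
d_wiki <- choose(x + r - 1, x) * (1 - p_wiki)^r * p_wiki^x  # Wikipedia's pmf

all.equal(d_R, d_wiki)     # the two pmfs agree

mu <- r * (1 - p_R) / p_R  # mean in R's parameterization, cf. ?NegBinomial
r / (r + mu)               # recovers p_R = 0.3
```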