Solved – the relationship between theta and size in the negative binomial distribution

negative-binomial-distribution

After fitting a negative binomial regression with glm.nb(y ~ x), I get a parameter theta and two coefficients. I then want to use dnbinom(x, size, prob, mu, log = FALSE) to calculate the predicted probabilities.

Can anyone explain the relationship between theta and size, and how to convert between these two parameters?

If every point's value in a dataset is expanded two times, how does size or theta change?

Best Answer

The $theta from a fitted glm.nb() corresponds to the size in dnbinom(). As a simple example, let's replicate the fitted log-likelihood from scratch. Using the quine data from MASS:

library("MASS")
m <- glm.nb(Days ~ ., data = quine)
logLik(m)
## 'log Lik.' -546.5755 (df=8)

And this value of the log-likelihood can be obtained by summing the dnbinom(..., log = TRUE) values:

sum(dnbinom(quine$Days, mu = fitted(m), size = m$theta, log = TRUE))
## [1] -546.5755
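
To get predicted probabilities rather than the log-likelihood, the same call can be evaluated at chosen counts. The sketch below is only an illustration: it assumes the first observation of quine as the case of interest, and the names mu1, theta, p_mu, prob and p_prob are made up for this example. It also shows the conversion to the prob argument of dnbinom(), which is prob = size / (size + mu) when size is set to theta. In this parameterization the variance is mu + mu^2 / theta.

```r
library("MASS")

m <- glm.nb(Days ~ ., data = quine)

theta <- m$theta               # plays the role of 'size' in dnbinom()
mu1   <- unname(fitted(m))[1]  # fitted mean for the first observation

# Predicted probabilities of observing exactly 0, 1, ..., 5 days absent
p_mu <- dnbinom(0:5, mu = mu1, size = theta)

# The same probabilities via the (size, prob) parameterization
prob   <- theta / (theta + mu1)
p_prob <- dnbinom(0:5, size = theta, prob = prob)

# Both parameterizations should agree up to numerical tolerance
all.equal(p_mu, p_prob)
```

For new data, the means would come from predict(m, newdata = ..., type = "response") instead of fitted(m).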

Doubling the weight of all observations leaves all parameter estimates (including theta) unchanged:

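quine$weights <- 2
# Predictors are listed explicitly: with Days ~ . the new weights column
# would itself be picked up as a (constant) predictor.
m2 <- glm.nb(Days ~ Eth + Sex + Age + Lrn, data = quine, weights = weights)
m$theta
## [1] 1.274893
m2$theta
## [1] 1.274893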
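
Weights of 2 can also be read as every observation appearing twice in the data. The following sketch makes that reading explicit by stacking quine on itself. It assumes this is what "expanded two times" means in the question, and the names f, m_once and m_twice are hypothetical. The point estimate of theta is expected to agree between the two fits up to convergence tolerance, while its standard error should shrink because the stacked data look like a larger sample.

```r
library("MASS")

# Explicit formula so the fit does not depend on any extra columns in quine
f <- Days ~ Eth + Sex + Age + Lrn

m_once  <- glm.nb(f, data = quine)
m_twice <- glm.nb(f, data = rbind(quine, quine))  # every row appears twice

# Point estimates of theta for the original and the stacked data
c(once = m_once$theta, twice = m_twice$theta)

# Standard errors of theta for the original and the stacked data
c(once = m_once$SE.theta, twice = m_twice$SE.theta)
```

This only covers duplicated observations. It says nothing about what happens to theta if the response values themselves are multiplied by two.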