Quasi-Poisson or negative binomial regression with continuous dependent variable

count-data, negative-binomial-distribution, offset, poisson-regression, quasi-likelihood

My dependent variable was originally count data. After several corrections it became a continuous variable (originally my data were pellet-group counts (for estimating deer density), corrected for research plot slope, averaged across three seasons, and converted to deer density (number per sq. km)). Can I still apply a quasi-Poisson or negative binomial GLM? The NB GLM in particular appears to be appropriate for my data structure (the histogram and mean/variance relationship look adequate), and R fits the models without any warning and gives reasonable results. Could there be a catch? Rounding the data to integers does not seem to be a solution, since I would lose part of the information and all the corrections would be meaningless.

Best Answer

Rounding your response variable to an integer is NOT OK. For simplicity, let's assume you're conducting a Poisson regression with a log link, where $Y$ is the deer density (number of deer per unit area). What you're modeling is the following:

$ \begin{align*} \log E(Y|x) &= \beta^{T}x + \beta_{0} \\[0.5em] \log \left( \frac{E(\mbox{No. of Deer}|x)}{\mbox{Area}} \right) &= \beta^{T}x + \beta_{0} \\[0.5em] \log E(\mbox{No. of Deer}|x) - \log(\mbox{Area}) &= \beta^{T}x + \beta_{0} \\[0.5em] \log E(\mbox{No. of Deer}|x) &= \beta^{T}x + \beta_{0} + \log(\mbox{Area}) \end{align*} $
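
Exponentiating the last line may make the role of the area term easier to see. Here $N$ (the number of deer) and $A$ (the area) are shorthand introduced only for this illustration:

$ \begin{align*} E(N|x) &= A \cdot \exp\left(\beta^{T}x + \beta_{0}\right) \\[0.5em] \frac{E(N|x)}{A} &= \exp\left(\beta^{T}x + \beta_{0}\right) \end{align*} $

So the expected count scales in proportion to the area, and the regression coefficients describe the density. The term $\log(\mbox{Area})$ enters the linear predictor with its coefficient fixed at 1, which is what an offset is.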

In R, this is done using the following command:

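# n_deer (the count), area, x and deer_data are placeholder names for your own variables and data frame
glm(n_deer ~ x + offset(log(area)), family = poisson(link = "log"), data = deer_data)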

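This allows you to use Poisson (or quasi-Poisson or negative binomial) regression to model a continuous quantity (the deer density), while the response passed to the model, the No. of Deer, is still a count. Your parameter estimates (i.e., $\beta_{0}$ and the components of $\beta$) will be on the log scale, so just exponentiate them to obtain estimates on the density scale: $\exp(\beta_{0})$ is the density at $x = 0$, and each exponentiated slope is a multiplicative change in density per unit increase in the corresponding covariate. Also, if you decide to go with negative binomial regression, check which parametrization of the negative binomial distribution your software uses.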
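
As a rough illustration of the offset approach, here is a minimal sketch on simulated data. Everything in it is an assumption made for the example: the data frame `deer`, the columns `n_deer`, `area` and `x`, and the true coefficients are invented, and the negative binomial fit uses `glm.nb()` from the MASS package, which is one option among several.

```r
# Sketch with simulated data; all variable names and parameter values are hypothetical.
library(MASS)  # provides glm.nb()

set.seed(1)
n <- 2000
deer <- data.frame(
  x    = runif(n),           # made-up covariate
  area = runif(n, 0.5, 2)    # made-up plot areas
)

# Assumed true model: density = exp(1 + 0.8 * x), expected count = area * density
mu <- deer$area * exp(1 + 0.8 * deer$x)
deer$n_deer <- rnbinom(n, mu = mu, size = 2)  # overdispersed integer counts

# The same linear predictor and offset under three error models
fit_pois  <- glm(n_deer ~ x + offset(log(area)),
                 family = poisson(link = "log"), data = deer)
fit_quasi <- glm(n_deer ~ x + offset(log(area)),
                 family = quasipoisson(link = "log"), data = deer)
fit_nb    <- glm.nb(n_deer ~ x + offset(log(area)), data = deer)

# Poisson and quasi-Poisson share point estimates; only the standard errors differ
coef(fit_pois)
coef(fit_quasi)

# Back-transform: density at x = 0 and multiplicative change in density per unit of x
exp(coef(fit_nb))

# Overdispersion under each parametrization
summary(fit_quasi)$dispersion  # quasi-Poisson: Var(Y) = phi * mu
fit_nb$theta                   # glm.nb: Var(Y) = mu + mu^2 / theta
```

The last two lines show why the parametrization matters: the quasi-Poisson dispersion and the `theta` reported by `glm.nb()` describe different variance functions and are not directly comparable. Other negative binomial implementations may report a different quantity, for example the reciprocal of `theta`.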