Solved – Measure of explained variance for Poisson GLM (log-link function)

generalized linear model, poisson distribution, r-squared

I am looking for an appropriate measure of the "explained variance" of a Poisson GLM (using a log-link function).

I have found a number of resources (both on this site and elsewhere) that discuss various pseudo-$R^2$ measures, but nearly all of them present the measures in relation to a logit-link function. They don't discuss whether the pseudo-$R^2$ measures are appropriate for other link functions, such as the log link in my Poisson GLM.

For example, here are a few of the sites I've found:

Which pseudo-$R^2$ measure is the one to report for logistic regression (Cox & Snell or Nagelkerke)?

http://thestatsgeek.com/2014/02/08/r-squared-in-logistic-regression/

http://www.ats.ucla.edu/stat/mult_pkg/faq/general/Psuedo_RSquareds.htm

My question is: Are any of the methods discussed at those links (in particular, the FAQ on the UCLA page) appropriate for a Poisson GLM (using a log-link function)? Is any particular method more appropriate and/or more commonly used than the others?

Some background:

This is for a research paper in which I am using a Poisson GLM to analyze neural data. I am using the deviances of the models (calculated assuming a Poisson distribution) to compare two models: model A, which includes 5 parameters, and model B, which leaves those 5 parameters out. My interest (and the focus of the paper) is to show that those 5 parameters statistically improve the model fit. However, one of the reviewers would like an indication of how well both models fit the data.

If I were using OLS to fit my data, the reviewer would effectively be asking for the $R^2$ value of the model with the 5 parameters and of the model without them, to indicate how well each model explains the variance. It seems like a reasonable request to me. Let's say that, hypothetically, model B has an $R^2$ of 0.05 and model A has an $R^2$ of 0.25: even though that may be a statistically significant improvement, neither model does a good job of explaining the data. Alternatively, if model B has an $R^2$ of 0.5 and model A has an $R^2$ of 0.7, that could be interpreted in a very different way. I'm looking for the most appropriate measure that can be applied in a similar way to my GLM.

Best Answer

McCullagh and Nelder (1989, page 34) give the deviance function $D$ for the Poisson distribution as:

$$ D = 2 \sum\left(y \log\left(\frac{y}{\mu} \right) - (y-\mu)\right) $$

where $y$ represents your data and $\mu$ your modelled output (the term $y \log(y/\mu)$ is taken to be 0 when $y = 0$). I use this function to estimate the explained deviance $ED$ of a GLM with Poisson distribution like this:

$$ ED = 1 - \frac{D}{\text{total deviance}} $$

where total deviance is given by the same equation for $D$ but using the mean of $y$ (a single number, i.e., $\mathrm{mean}(y)$) instead of the array of modelled estimates $\mu$.

I do not know if this is 100% correct, but it sounds logical to me and seems to behave as you would expect an estimate of the explained deviance to behave (it gives you 1 if you use $\mu = y$, 0 if you use $\mu = \mathrm{mean}(y)$, etc.).
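As a rough illustration of the calculation described above, here is a minimal NumPy sketch. The function names `poisson_deviance` and `explained_deviance` and the toy arrays `y` and `mu` are made up for this example and do not come from any library. The sketch assumes the standard form of the Poisson deviance, $2\sum\left(y\log(y/\mu) - (y-\mu)\right)$, with the log term treated as 0 wherever $y = 0$.

```python
import numpy as np


def poisson_deviance(y, mu):
    """Poisson deviance 2 * sum(y*log(y/mu) - (y - mu)).

    The y*log(y/mu) term is set to 0 where y == 0 (its limiting value).
    """
    y = np.asarray(y, dtype=float)
    mu = np.broadcast_to(np.asarray(mu, dtype=float), y.shape)
    log_term = np.zeros_like(y)
    pos = y > 0
    log_term[pos] = y[pos] * np.log(y[pos] / mu[pos])
    return 2.0 * np.sum(log_term - (y - mu))


def explained_deviance(y, mu):
    """1 - D(model) / D(null), where the null model predicts mean(y)."""
    y = np.asarray(y, dtype=float)
    total = poisson_deviance(y, y.mean())
    return 1.0 - poisson_deviance(y, mu) / total


# Toy data (hypothetical): observed counts and fitted means from some model.
y = np.array([0, 1, 3, 2, 6, 8], dtype=float)
mu = np.array([0.5, 1.2, 2.5, 2.8, 5.5, 7.5])

ed_model = explained_deviance(y, mu)          # somewhere between 0 and 1 here
ed_perfect = explained_deviance(y, y)         # perfect fit: mu equals y
ed_null = explained_deviance(y, y.mean())     # null fit: mu equals mean(y)

print(ed_model, ed_perfect, ed_null)
```

Running the same function on the fitted means of model A and of model B would give one explained-deviance value per model, which can then be reported side by side in the way $R^2$ values would be for OLS.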