I am learning the concept of residuals in modelling. I performed a Poisson GLM for a 3×3 contingency table and I got the summary of the model. My question is: the deviance residual from the in-built function resid
is differing from the residuals calculated manually using the formula 2 * (y * log(ifelse(y == 0, 1, y/mu)) - (y - mu))
. Why are the values different? What am I doing wrong here?
Example:
counts=c(0,1,2,2,0,8,8,13,16)
outcome=gl(3,1,9)
treatment=gl(3,3)
mod=glm(counts~outcome+treatment,family=poisson())
resid(mod,type="deviance")
## 1 2 3 4 5 6 7
##-1.0954451 0.1694307 0.3374099 0.0000000 -2.3664319 1.1368934 0.2176802
## 8 9
## 0.7886223 -0.7609947
fitted.values=mod$fitted.values
dev= 2*(counts*log(ifelse(counts==0,1,counts/fitted.values))-(counts-fitted.values))
## 1 2 3 4 5
## 1.200000e+00 2.870677e-02 1.138454e-01 -2.928413e-23 5.600000e+00
## 6 7 8 9
## 1.292527e+00 4.738466e-02 6.219251e-01 5.791129e-01
My guess is that I am giving a value 1 for all zero counts here. If I applied the formula without imputing a value 1 then the result shows infinite. How are the deviance residuals being calculated in glm?
Best Answer
You're calculating the deviance per point. The deviance residual is $\textrm{sign}(y-\mu) \sqrt{d_i}$. You need to take the square root (after setting values less than zero, like the tiny residual value for observation 4, to zero), then apply the sign of the difference: