Solved – Pearson and deviance GOF test for logistic regression in SAS and R

Tags: deviance, goodness of fit, logistic, r, sas

I've been trying to fit exactly the same logistic regression model (same data) in SAS and R. The coefficients agree between the two.
However, when I ran goodness-of-fit tests based on the Pearson and deviance residuals, I noticed a huge difference in how the statistics are computed.
It's hard to provide reproducible data here, but this is my output:

In R:

    1 - pchisq(deviance(modelx), df.residual(modelx))
    [1] 0.0003661318

    1 - pchisq(sum(residuals(modelx, type = "pearson")^2), df.residual(modelx))
    [1] 0.4574779

    deviance(modelx)
    [1] 3284.208

    df.residual(modelx)
    [1] 3015

    sum(residuals(modelx, type = "pearson")^2)
    [1] 3022.632

While in SAS it's:

| Criterion | Value | DF | Value/DF | Pr > ChiSq |
|---|---|---|---|---|
| Deviance | 2347.8792 | 2116 | 1.1096 | 0.0003 |
| Pearson | 2126.1138 | 2116 | 1.0048 | 0.4343 |

The p-values are similar, but the test statistics and the degrees of freedom are completely different.

I've read that SAS calculates both the statistic and the DF using "profiles" (http://support.sas.com/resources/papers/proceedings14/1485-2014.pdf, page 3), but I still don't understand how those profiles are constructed, or why one would use profiles at all. I have 7 categorical predictors in my data, with 3, 4, 5, 5, 5, 6 and 6 categories respectively.

Any ideas?

Best Answer

As has been discussed elsewhere on this site, the deviance statistic for ungrouped binary data does not have a $\chi^2$ distribution. Any statistic whose d.f. grows with the sample size has a degenerate distribution.
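The discrepancy in the question looks like an instance of this. R's glm() treats every row as its own observation, so the residual d.f. is the number of rows minus the number of parameters. According to the paper linked in the question, SAS instead aggregates rows that share the same covariate values into "profiles" and counts those. If so, the gap between the two DFs, 3015 - 2116 = 899, would be the number of observations minus the number of distinct profiles. Below is a minimal base R sketch of how the same model yields different deviance, Pearson statistic and d.f. depending on the grouping. It uses simulated data, not the asker's, and the variables a, b, y, events and trials are hypothetical.

```r
set.seed(1)
n <- 500

# Hypothetical data: two categorical predictors, so covariate patterns repeat
d <- data.frame(
  a = factor(sample(1:3, n, replace = TRUE)),
  b = factor(sample(1:4, n, replace = TRUE))
)
d$y <- rbinom(n, 1, plogis(-0.5 + 0.4 * (d$a == "2") + 0.8 * (d$b == "3")))

# Ungrouped fit: one row per observation (the default view in glm())
fit_ungrouped <- glm(y ~ a + b, family = binomial, data = d)

# Grouped fit: one row per distinct covariate pattern ("profile")
d$events <- d$y
d$trials <- 1
g <- aggregate(cbind(events, trials) ~ a + b, data = d, FUN = sum)
fit_grouped <- glm(cbind(events, trials - events) ~ a + b,
                   family = binomial, data = g)

# Same coefficients, but different GOF statistics and degrees of freedom
gof <- function(fit) {
  c(deviance = deviance(fit),
    pearson  = sum(residuals(fit, type = "pearson")^2),
    df       = df.residual(fit))
}
print(rbind(ungrouped = gof(fit_ungrouped), grouped = gof(fit_grouped)))
print(all.equal(coef(fit_ungrouped), coef(fit_grouped), tolerance = 1e-4))
```

The coefficients agree because the likelihoods differ only by a constant. The deviance and Pearson statistics measure distance from a different saturated model in each case, which is why both their values and their d.f. change.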

For goodness of fit, set up simple directed hypotheses, such as tests of linearity and additivity, or use the 1 d.f. test provided by the residuals.lrm function (type = "gof") in the R rms package.
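Both suggestions can be sketched roughly as follows, again on simulated data with hypothetical variables a, b and y. Since the predictors in the question are all categorical, the directed hypothesis shown is additivity: a likelihood ratio test of the main-effects model against one with an a:b interaction. The rms part runs only if that package is installed, and the fit needs x = TRUE, y = TRUE so that the residuals can be computed.

```r
set.seed(2)
n <- 400

# Hypothetical data with two categorical predictors
d <- data.frame(
  a = factor(sample(1:3, n, replace = TRUE)),
  b = factor(sample(1:4, n, replace = TRUE))
)
d$y <- rbinom(n, 1, plogis(-0.5 + 0.4 * (d$a == "2") + 0.8 * (d$b == "3")))

# Directed hypothesis: is the additive model adequate, or is a:b needed?
fit_add <- glm(y ~ a + b, family = binomial, data = d)
fit_int <- glm(y ~ a * b, family = binomial, data = d)
lrt <- anova(fit_add, fit_int, test = "Chisq")
print(lrt)

# Optional: the 1 d.f. global goodness-of-fit test from rms
if (requireNamespace("rms", quietly = TRUE)) {
  fit_lrm <- rms::lrm(y ~ a + b, data = d, x = TRUE, y = TRUE)
  print(residuals(fit_lrm, type = "gof"))
}
```

The likelihood ratio test here has a fixed number of d.f. (the number of interaction terms) regardless of sample size, which is what makes it a usable test where the global deviance test is not.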
