Solved – Which method to choose when comparing non-nested models

r, regression

I have 2 non-nested models which I would like to compare. Both models are based on the same dataset but use different predictors.

Model1: predictors A + B
Model2: predictors B + C

I know there are multiple tests available to select the "best" model:
1) jtest (Davidson-MacKinnon J test)
2) coxtest (Cox test)
3) encomptest (Davidson & MacKinnon encompassing test)

All of these tests are described in R for the comparison of non-nested models. However, which test is preferred?

If I understand the tests correctly, all of them say that Model1 is the better model.

> coxtest(Model1,Model2)
Cox test

Model 1: group ~ A + B
Model 2: group ~ C + B
                Estimate Std. Error  z value Pr(>|z|)    
fitted(M1) ~ M2  -3.0809     3.1646  -0.9735   0.3303    
fitted(M2) ~ M1 -31.1339     2.0889 -14.9043   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


> jtest(Model1,Model2)
J test

Model 1: group ~ A + B
Model 2: group ~ C + B
                Estimate Std. Error t value  Pr(>|t|)    
M1 + fitted(M2)  0.18681    0.21166  0.8826    0.3786    
M2 + fitted(M1)  0.93740    0.13155  7.1257 2.149e-11 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


> encomptest(Model1,Model2, data=data)
Encompassing test
Model 1: group ~ A + B
Model 2: group ~ C + B
Model E: group ~ A + B + C
          Res.Df Df       F   Pr(>F)    
M1 vs. ME    188 -1  1.2402   0.2669    
M2 vs. ME    188 -1 24.3536 1.76e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Best Answer

For your particular problem, the answer would be: "None of these tests is appropriate."

The help pages for these 3 functions indicate that they all expect standard linear regression models (fitted with lm() in R), implicitly with a continuous outcome variable.

In your case, however, your outcome variable is a choice between 2 groups, not a continuous outcome variable. Thus you should be using logistic regression instead of linear regression. Finding the "best" linear regression, as you seem to be attempting, could be far from finding the best model for predicting group membership.
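As a rough illustration of the difference, the sketch below fits both a linear and a logistic regression to a 0/1 outcome. The data are simulated purely for this example (they are not your data), and the column names A, B, C and group are only borrowed from the model formulas above; the coefficients used to generate the outcome are arbitrary assumptions.

```r
# Simulated data for illustration only; NOT the data from the question.
set.seed(1)
n <- 200
dat <- data.frame(A = rnorm(n), B = rnorm(n), C = rnorm(n))
# Binary outcome generated from an assumed (arbitrary) logistic relationship
dat$group <- rbinom(n, size = 1, prob = plogis(2 * dat$A + dat$B))

# Linear regression treats the 0/1 outcome as if it were continuous
lin_fit <- lm(group ~ A + B, data = dat)

# Logistic regression models the probability of belonging to group 1
logit_fit <- glm(group ~ A + B, data = dat, family = binomial)

# Linear fitted values are not constrained to the interval [0, 1]
print(range(fitted(lin_fit)))
# Logistic fitted values are probabilities, so they stay between 0 and 1
print(range(fitted(logit_fit)))
```

Comparing the two printed ranges shows why the linear fit is awkward for predicting group membership: nothing keeps its fitted values interpretable as probabilities.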

There is considerable discussion on this site about the best ways to compare non-nested generalized linear models (like logistic regression). This page provides arguments for using the Akaike Information Criterion (AIC) to choose among non-nested models, while further providing a link to an informed difference of opinion.

The standard stats package in R has 2 functions to determine the AIC from models (AIC() and extractAIC()); this page shows a way to use them for comparing 2 models. Just be careful to stick with one or the other as the outputs from the 2 functions differ by additive constants.
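A minimal sketch of such a comparison, assuming both models are refitted as logistic regressions with glm(). The data are again simulated for illustration only, and the generating coefficients are arbitrary assumptions; only AIC() is used here so that the two values are on the same scale.

```r
# Simulated data for illustration only; NOT the data from the question.
set.seed(42)
n <- 200
dat <- data.frame(A = rnorm(n), B = rnorm(n), C = rnorm(n))
dat$group <- rbinom(n, size = 1, prob = plogis(1.5 * dat$A + 0.5 * dat$B))

# The two non-nested candidate models, fitted as logistic regressions
Model1 <- glm(group ~ A + B, data = dat, family = binomial)
Model2 <- glm(group ~ B + C, data = dat, family = binomial)

# AIC() accepts several fitted models and returns a data frame
# with one row per model and the columns df and AIC
aic_table <- AIC(Model1, Model2)
print(aic_table)

# The model with the lower AIC is the preferred one under this criterion
best <- rownames(aic_table)[which.min(aic_table$AIC)]
print(best)
```

Both models must be fitted to exactly the same observations for the AIC values to be comparable, so rows with missing values in A or C need to be handled before fitting.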