Solved – Compare model fit for models trained on different data

goodness-of-fit, r, regression

I have two datasets, where I'm predicting y from x using a basic linear model for each dataset. Both models are significant, but have different results.

In dataset A, the coefficient estimate for x is 1.5, but the r-squared is only 0.2.
In dataset B, the coefficient estimate for x is 0.6, but the r-squared is 0.7.
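
For concreteness, the setup is roughly this (the data frame names are just placeholders):

```r
# Same simple linear model, fit separately to each dataset
fit_A <- lm(y ~ x, data = dfA)   # dfA, dfB: placeholder names
fit_B <- lm(y ~ x, data = dfB)

summary(fit_A)  # slope for x around 1.5, r-squared around 0.2
summary(fit_B)  # slope for x around 0.6, r-squared around 0.7
```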

Obviously, the model for dataset B fits its data better according to the r-squared values. But I want to quantify this goodness of fit statistically, rather than just picking the better model by the larger r-squared.

The datasets are different sizes and the models are not nested, so I can't compare them the way I would if I were testing, say, the null model against the full model. I'm not interested in the estimates themselves, so combining the data into one dataset and testing the interaction doesn't give me what I'm looking for (at least I don't think it does).

What is the appropriate statistical test of whether model B fits y for its dataset better than model A fits y for its dataset?

Best Answer

What you are looking for is a goodness-of-fit measure for a statistical model. These measures summarize the discrepancy between the observed values and the values expected under the model. Depending on certain conditions, you could use the Akaike information criterion (AIC), the Bayesian information criterion (BIC), etc.
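
In R these are one-liners once each model has been fit to its own dataset (object and data frame names below are just placeholders, matching the question):

```r
# The two fits from the question (placeholder data frame names)
fit_A <- lm(y ~ x, data = dfA)
fit_B <- lm(y ~ x, data = dfB)

AIC(fit_A); AIC(fit_B)   # Akaike information criterion for each model
BIC(fit_A); BIC(fit_B)   # Bayesian information criterion for each model
```

Bear in mind that information criteria computed on different datasets are not directly comparable, so here they mainly serve as per-model summaries of fit.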

If you are looking for more than a goodness-of-fit measure, you could use methods involving model mimicry. Model mimicry is the extent to which one model can account for data generated by the other model. The better-fitting of the two models will normally account for its own data and, to some extent, also account for data generated by the competing model. (Note that the two models should be competing models.)

Have a look at this paper, where Wagenmakers et al. describe model selection by quantifying model mimicry. The procedure is intuitive and can easily be coded in R.
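
Here is a minimal sketch of the general idea (not the exact procedure from the paper): simulate data from each fitted model, fit both candidate models to every simulated dataset, and see which bootstrap distribution the observed goodness-of-fit difference resembles. The `fit_A`, `fit_B`, `simulate_from`, and `gof` functions are placeholders you would supply for your own models.

```r
## Rough sketch of parametric-bootstrap model mimicry, not the paper's exact method.
## fit_A, fit_B: functions that fit the two competing models to a data frame.
## simulate_from: function that generates one synthetic data set from a fitted model.
## gof: function returning a goodness-of-fit value (e.g. deviance or AIC) for a fit.
mimicry_sketch <- function(data, fit_A, fit_B, simulate_from, gof, n_boot = 1000) {
  fitted_A <- fit_A(data)
  fitted_B <- fit_B(data)

  # Observed difference in goodness of fit between the two models
  obs_diff <- gof(fitted_A) - gof(fitted_B)

  diff_gen_A <- numeric(n_boot)  # differences when model A generates the data
  diff_gen_B <- numeric(n_boot)  # differences when model B generates the data

  for (i in seq_len(n_boot)) {
    sim_A <- simulate_from(fitted_A)                     # data generated by model A
    sim_B <- simulate_from(fitted_B)                     # data generated by model B
    diff_gen_A[i] <- gof(fit_A(sim_A)) - gof(fit_B(sim_A))
    diff_gen_B[i] <- gof(fit_A(sim_B)) - gof(fit_B(sim_B))
  }

  # Compare obs_diff to the two bootstrap distributions: if it looks more like
  # the distribution obtained when A generated the data, the data favour A.
  list(observed = obs_diff,
       generated_by_A = diff_gen_A,
       generated_by_B = diff_gen_B)
}
```

In your case the two candidate models have the same form but are fit to different datasets, so you would need to adapt the bootstrap accordingly; the sketch is only meant to show the shape of the mimicry comparison.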

Hope it helps!
