I'm brand new to R and am unsure which model to select.
-
I ran a forward stepwise regression, adding each variable based on the lowest AIC. I came up with 3 models and am unsure which one is the "best".
Model 1: Var1 (p=0.03), AIC = 14.978
Model 2: Var1 (p=0.09) + Var2 (p=0.199), AIC = 12.543
Model 3: Var1 (p=0.04) + Var2 (p=0.04) + Var3 (p=0.06), AIC = -17.09
I'm inclined to go with Model #3 because it has the lowest AIC (I heard negative is ok) and the p-values are still rather low.
I've run 8 variables as predictors of hatchling mass and found that these three variables are the best predictors.
-
For my next forward stepwise regression I chose Model 2 because, even though its AIC was slightly larger, the p-values were all smaller. Do you agree this is the best?
Model 1: Var1 (p=0.321) + Var2 (p=0.162) + Var3 (p=0.163) + Var4 (p=0.222), AIC = 25.63
Model 2: Var1 (p=0.131) + Var2 (p=0.009) + Var3 (p=0.0056), AIC = 26.518
Model 3: Var1 (p=0.258) + Var2 (p=0.0254), AIC = 36.905
thanks!
Best Answer
AIC is a goodness-of-fit measure that rewards smaller residual error but penalises each additional predictor, which helps avoid overfitting. Only differences in AIC between models fitted to the same data matter, so a negative value is not a problem. In your second set of models, Model 1 (the one with the lowest AIC) may perform best when used for prediction outside your dataset. A possible explanation for why adding Var4 to Model 2 lowers the AIC but raises the p-values is that Var4 is somewhat correlated with Var1, Var2 and Var3: correlated predictors share explanatory power, which inflates the standard errors of the individual coefficients. Model 2 is thus easier to interpret.
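
If it helps, here is a rough sketch of how one might compare nested models by AIC and check whether the predictors are correlated with each other. Your data are not shown, so this uses R's built-in mtcars dataset purely as a stand-in; the response (mpg), the predictors (wt, hp, disp, cyl) and the object names fit2, fit3 and fit4 are all assumptions for illustration, not your variables.

```r
# Stand-in data: built-in mtcars, NOT the hatchling data from the question.
# Three nested linear models with 2, 3 and 4 predictors.
fit2 <- lm(mpg ~ wt + hp, data = mtcars)
fit3 <- lm(mpg ~ wt + hp + disp, data = mtcars)
fit4 <- lm(mpg ~ wt + hp + disp + cyl, data = mtcars)

# AIC for each model, side by side. Lower is better, and only the
# differences between models fitted to the same data are meaningful.
AIC(fit2, fit3, fit4)

# Pairwise correlations among the predictors. Large absolute values
# suggest the predictors overlap in what they explain.
cor(mtcars[, c("wt", "hp", "disp", "cyl")])

# Coefficient table of the largest model: estimates, standard errors,
# t values and p-values for each term.
summary(fit4)$coefficients
```

With your own data you would swap in your data frame and variable names. If the correlation matrix shows Var4 strongly related to the other three, that would be consistent with the explanation above for the higher p-values in Model 1.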