AIC in Model Selection – Why It Yields Non-significant p-values

aic, model-selection, p-value

I have some questions about the AIC and hope you can help me. I applied model selection (backward or forward) based on the AIC to my data, and some of the selected variables ended up with p-values > 0.05. I know people say we should select models based on the AIC instead of the p-value, so it seems that the AIC and the p-value are two different concepts. Could someone tell me what the difference is? What I understand so far is that:

  1. For backward selection using the AIC, suppose we have 3 variables (var1, var2, var3) and the AIC of this model is AIC*. If excluding any one of these three variables would not end up with an AIC that is significantly lower than AIC* (in terms of a chi-squared distribution with df = 1), then we would say these three variables are the final result.

  2. A significant p-value for a variable (e.g. var1) in a three-variable model means that the standardized effect size of that variable is significantly different from 0 (according to a Wald or t-test).

What is the fundamental difference between these two methods? How do I interpret it if some variables have non-significant p-values in my best model (obtained via the AIC)?
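To make the first procedure concrete, here is a minimal backward-elimination loop driven by the AIC, a sketch on simulated data (the data, the Gaussian-AIC formula up to an additive constant, and the variable layout are all assumptions for illustration, not anything from the question):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
X = rng.normal(size=(n, 3))                              # var1, var2, var3
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)   # var3 is pure noise

def aic(y, X):
    """Gaussian AIC up to an additive constant: n*log(RSS/n) + 2k."""
    Xd = np.column_stack([np.ones(len(y)), X])            # add an intercept
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    rss = np.sum((y - Xd @ beta) ** 2)
    return len(y) * np.log(rss / len(y)) + 2 * Xd.shape[1]

keep = [0, 1, 2]
while True:
    current = aic(y, X[:, keep])
    # AIC of each candidate model with one variable removed
    candidates = [(aic(y, X[:, [j for j in keep if j != i]]), i) for i in keep]
    best, drop = min(candidates)
    if best < current:        # removal lowers the AIC -> drop the variable
        keep.remove(drop)
    else:
        break                 # no single removal improves the AIC
print("selected variables:", keep)
```

Note that the loop never compares the AIC difference to a chi-squared cutoff; it simply keeps whichever model has the lower AIC, which is the point the answer below makes.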

Best Answer

AIC and its variants are closer to variations on $R^2$ than to p-values of each regressor. More precisely, they are penalized versions of the log-likelihood.
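Concretely, $\mathrm{AIC} = 2k - 2\log \hat{L}$, where $k$ is the number of fitted parameters and $\hat{L}$ is the maximized likelihood. A small pure-Python sketch (the residual vectors are invented numbers, just to show the penalty at work for a Gaussian model):

```python
import math

def gaussian_loglik(residuals):
    """Maximized Gaussian log-likelihood given a model's residuals."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n   # MLE of the error variance
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)

def aic(residuals, k):
    """AIC = 2k - 2*logL_hat; lower is better."""
    return 2 * k - 2 * gaussian_loglik(residuals)

# Two fits to the same data: the second uses one more parameter but fits
# only slightly better, so the penalty makes its AIC worse.
resid_small = [1.00, -0.80, 0.60, -1.10, 0.90, -0.50]
resid_big   = [0.95, -0.78, 0.58, -1.05, 0.88, -0.50]
print(aic(resid_small, k=2), aic(resid_big, k=3))
```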

You don't want to test differences of AIC using chi-squared. You could test differences of the log-likelihood using chi-squared (if the models are nested). For AIC, lower is better (in most implementations, anyway); no further adjustment is needed.
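A sketch of the legitimate chi-squared comparison for nested models, and of why AIC-selected models can contain variables with p > 0.05 (the log-likelihood values are hypothetical; only the stdlib is used, via the identity that a chi-squared(1) tail equals `erfc(sqrt(x/2))`):

```python
import math

def chi2_sf_df1(x):
    """Survival function of a chi-squared(1) variable: P(X > x)."""
    return math.erfc(math.sqrt(x / 2))

def lr_test(loglik_full, loglik_reduced):
    """Likelihood-ratio test for nested models differing by one parameter."""
    lr = 2 * (loglik_full - loglik_reduced)
    return lr, chi2_sf_df1(lr)

# Hypothetical fitted log-likelihoods for models with and without var1.
lr, p = lr_test(loglik_full=-120.0, loglik_reduced=-123.5)
print(f"LR = {lr:.1f}, p = {p:.4f}")

# AIC uses the same quantity: with one extra parameter,
# AIC_full - AIC_reduced = 2 - LR, so AIC keeps the variable when LR > 2.
# LR = 2 corresponds to p ~ 0.157, a much looser bar than p < 0.05 --
# which is why an AIC-best model can contain "non-significant" variables.
print(f"p at the AIC threshold: {chi2_sf_df1(2.0):.3f}")
```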

You really want to avoid automated model selection methods, if you possibly can. If you must use one, try LASSO or LAR.
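For instance, a minimal LASSO sketch with scikit-learn on simulated data (the data-generating setup is an assumption for illustration): the L1 penalty shrinks irrelevant coefficients exactly to zero, so selection and estimation happen in one step instead of a sequence of significance tests.

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(1)
n, p = 200, 10
X = rng.normal(size=(n, p))
# Only the first two predictors matter; the other eight are noise.
y = 3.0 * X[:, 0] + 1.5 * X[:, 1] + rng.normal(size=n)

# Cross-validated LASSO: the penalty strength is chosen by 5-fold CV,
# and coefficients of useless predictors are driven to exactly zero.
model = LassoCV(cv=5).fit(X, y)
print("non-zero coefficients:", np.flatnonzero(model.coef_))
```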