AIC scores are only comparable between models fitted to exactly the same data. Even within the same dataset you cannot compare AIC scores if, say, one model is fitted on 70 records and another on 69 (because of a missing value in one of the variables).
Couldn't you combine all the datasets into one single large dataset? Then you could compare your different models using AIC scores without any problems.
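A minimal sketch of the complete-case approach in Python (the file and column names are hypothetical), using pandas and statsmodels: drop rows with missing values in the union of all candidate predictors first, so every model is fitted on exactly the same records and the AIC scores are directly comparable.

```python
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("data.csv")                      # hypothetical dataset
cols = ["y", "x1", "x2"]                          # union of variables used by any candidate model
complete = df.dropna(subset=cols)                 # same n records for every model

m1 = smf.ols("y ~ x1", data=complete).fit()       # smaller model
m2 = smf.ols("y ~ x1 + x2", data=complete).fit()  # larger model

print(m1.aic, m2.aic)                             # fitted on the same data, so comparable
```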
The form of $AICc$ given by
$$
AICc = AIC + \frac{2k(k+1)}{n-k-1}
$$
was proposed by
Hurvich, C. M.; Tsai, C.-L. (1989), "Regression and time series model selection in small samples", Biometrika 76: 297–307
specifically for a linear regression model with normally distributed errors. For different models, a different correction will need to be derived.
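This correction is trivial to compute once $AIC$, $k$ and $n$ are known; a minimal sketch:

```python
def aicc(aic: float, k: int, n: int) -> float:
    """Small-sample corrected AIC of Hurvich & Tsai (1989).

    k is the number of estimated parameters (including the error
    variance) and n the number of observations; requires n > k + 1.
    """
    if n <= k + 1:
        raise ValueError("AICc is undefined for n <= k + 1")
    return aic + 2 * k * (k + 1) / (n - k - 1)
```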
These derivations are often difficult, and the resulting correction can be challenging to calculate. For instance,
Hurvich, C. M.; Simonoff, J. S.; Tsai, C.-L. (1998), "Smoothing parameter selection in nonparametric regression using an improved Akaike information criterion", Journal of the Royal Statistical Society, Series B 60(2): 271-293
propose a correction for nonparametric regression models which takes the form
$$
AICc = -2\ln(L) + n^2\int_0^1(1-t)^{r/2-2}\prod_{j=1}^{r}(1-t+2d_j)^{-1/2}\,dt + n\int_0^{\infty}\sum_{i=1}^n\frac{c_{ii}}{1+2d_it}\prod_{j=1}^n(1+2d_jt)^{-1/2}\,dt
$$
I will not go into the details here, as they are largely beside the point, but I wanted to illustrate the complexity involved: actually calculating this value requires an eigen-analysis and numerical integration.
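To give a feel for what that involves, here is a rough sketch of evaluating just the first integral term above with scipy, assuming the $d_j$ are the eigenvalues produced by that eigen-analysis; the values below are made up purely for illustration.

```python
import numpy as np
from scipy.integrate import quad

d = np.array([0.9, 0.5, 0.2, 0.05])  # hypothetical eigenvalues d_j, so r = 4
r, n = len(d), 50                     # n is the sample size

def integrand(t):
    # (1-t)^(r/2-2) * prod_j (1 - t + 2 d_j)^(-1/2)
    return (1 - t) ** (r / 2 - 2) * np.prod((1 - t + 2 * d) ** -0.5)

first_term, abserr = quad(integrand, 0.0, 1.0)  # numerical quadrature on [0, 1]
print(n**2 * first_term)                        # the n^2 * integral term of AICc
```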
For reasons like this, many authors such as
Burnham, K. P.; Anderson, D. R. (2002), Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach (2nd ed.), Springer-Verlag, ISBN 0-387-95364-7
suggest using the form
$$
AICc = AIC + \frac{2k(k+1)}{n-k-1}
$$
regardless of model. Even Hurvich et al. (1998), despite deriving their complicated $AICc$ for nonparametric regression, ultimately conclude that you might as well use the much simpler version derived for linear regression.
Generally this advice seems to work well, giving practically useful results. However, there are circumstances, such as the one you've highlighted, where it doesn't. You would need to find an appropriate $AICc$ for k-means, derive one yourself, or simply use $AIC$, which is more generally applicable.
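A quick numerical illustration of why the correction matters in small samples: the extra penalty $2k(k+1)/(n-k-1)$ explodes as $k$ approaches $n$, whereas plain $AIC$'s penalty $2k$ only grows linearly.

```python
n = 30
for k in (2, 5, 10, 20, 25, 28):
    extra = 2 * k * (k + 1) / (n - k - 1)  # the AICc correction term
    print(f"k={k:2d}  AIC penalty 2k={2 * k:3d}  AICc extra penalty={extra:8.1f}")
```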
Best Answer
I wish I understood the paper better, but if you look at equation 3 of Hurvich & Tsai (1989), they define the AIC itself as: $$ \textrm{AIC} = n(\log\hat{\sigma}^2 + 1) + 2\left(m + 1\right) $$
This naïvely implies $k = m+1$, in which case the Hurvich & Tsai form and the post-1999 Anderson et al. form are actually one and the same, since $$ k = m+1 \implies \frac{2(m+1)(m+2)}{n-m-2} \equiv \frac{2k(k+1)}{n-k-1} $$
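A quick symbolic check of that identity with sympy:

```python
import sympy as sp

m, n = sp.symbols("m n", positive=True)
k = m + 1  # Hurvich & Tsai's m + 1 parameters, identified with k

lhs = 2 * (m + 1) * (m + 2) / (n - m - 2)  # Hurvich & Tsai correction
rhs = 2 * k * (k + 1) / (n - k - 1)        # post-1999 form with k = m + 1

assert sp.simplify(lhs - rhs) == 0  # identical expressions
```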
Edit - (Cavanaugh 1997)
See Cavanaugh (1997) (pdf), specifically page 203, where in the derivation he sets $k = p+1$; it is as @Glen_b said: $k$ includes the error variance and $p$ does not.
Reference: Cavanaugh, J. E. (1997), "Unifying the derivations for the Akaike and corrected Akaike information criteria", Statistics & Probability Letters, 33: 201-208