Solved – Overfitting when using corrected AIC for model selection

aic, forecasting, model selection, time series

I am using the corrected AIC (AICc) to select the lag order in a simple AR(p) model. I chose the AICc because my sample is fairly small (n = 135). The model that minimizes the AICc is an AR(15). Including 15 lags in such a small sample looks like overfitting to me.

Do you agree?
And does anyone know a rule of thumb for max lag order relative to sample size?
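
To make the size of the small-sample correction concrete, here is a minimal sketch. It assumes the usual definition AICc = AIC + 2k(k+1)/(n-k-1) with AIC = -2 log L + 2k, and it assumes k = p + 1 parameters for an AR(p) (p coefficients plus the innovation variance); counting the mean as well would add one more. The function name is illustrative only.

```python
def aicc_penalty(k, n):
    """Total AICc penalty: the AIC term 2k plus the small-sample correction."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    return 2 * k + 2 * k * (k + 1) / (n - k - 1)


n = 135  # sample size from the question
for p in (1, 5, 10, 15, 21):
    k = p + 1  # assumption: p AR coefficients plus the innovation variance
    correction = aicc_penalty(k, n) - 2 * k
    print(f"p={p:2d}  AIC penalty={2 * k:3d}  extra AICc correction={correction:.2f}")
```

Under these assumptions the correction grows roughly quadratically in k, so it matters more for the higher orders than for the low ones.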

From the documentation of the ar() function in the stats package in R:

order.max
Maximum order (or order) of model to fit. Defaults to the smaller of N-1 and 10*log10(N) where N is the number of observations except for method="mle" where it is the minimum of this quantity and 12.
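
As a quick check of what this default implies for n = 135, the rule can be sketched in a few lines. This is a reading of the quoted text, not R's source: the function name is hypothetical, and rounding 10*log10(N) down to an integer is an assumption.

```python
import math


def default_order_max(n_obs, method="yule-walker"):
    """Default maximum AR order as described in the quoted documentation.

    Assumption: the 10*log10(N) term is rounded down to an integer.
    """
    base = min(n_obs - 1, math.floor(10 * math.log10(n_obs)))
    # For method="mle" the documentation caps the value at 12.
    return min(base, 12) if method == "mle" else base


print(default_order_max(135))         # 21
print(default_order_max(135, "mle"))  # 12
```

So with 135 observations the default search would consider orders up to 21, and the selected AR(15) lies well inside that range.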

These defaults make little sense to me. Where do they come from?

Best Answer

You need to use the likelihood for the whole sample, derived from first principles: $$ \log L \sim -\frac12 (\mathbf{y}-\boldsymbol{\mu})'\Sigma(\theta)^{-1}(\mathbf{y}-\boldsymbol{\mu}) - \frac12 \log |\Sigma(\theta)| $$ where $\mathbf{y}\in \mathbb{R}^{135}$ is your whole sample vector, $\boldsymbol{\mu}$ is its mean vector, and $\Sigma(\theta)$ is the model-implied covariance matrix of your ARMA($p,q$) process. God only knows what a naive AIC calculation that treats the data as i.i.d. is actually doing; it is out of context and has little value here.
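
The likelihood above can be evaluated directly for a small case. The following is a minimal sketch for a stationary AR(1) only. It assumes the autocovariance $\gamma(h) = \sigma^2 \phi^{|h|}/(1-\phi^2)$ and a constant mean, and it includes the $-\frac{n}{2}\log 2\pi$ constant that the formula drops. The function names and the toy data are illustrative, not from any library. A higher-order AR(p) or an ARMA($p,q$) would need its own covariance builder.

```python
import math


def ar1_covariance(phi, sigma2, n):
    """Covariance matrix of a stationary AR(1): gamma(h) = sigma2 * phi^|h| / (1 - phi^2)."""
    gamma0 = sigma2 / (1.0 - phi * phi)
    return [[gamma0 * phi ** abs(i - j) for j in range(n)] for i in range(n)]


def cholesky(a):
    """Lower-triangular factor low with low * low' = a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def gaussian_loglik(y, mu, cov):
    """Exact multivariate normal log-likelihood of the whole sample vector y."""
    n = len(y)
    low = cholesky(cov)
    # Forward substitution solves low * z = (y - mu); z'z is then the quadratic form.
    z = []
    for i in range(n):
        s = (y[i] - mu) - sum(low[i][k] * z[k] for k in range(i))
        z.append(s / low[i][i])
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    quad = sum(v * v for v in z)
    return -0.5 * quad - 0.5 * log_det - 0.5 * n * math.log(2.0 * math.pi)


# Toy data, made up for illustration only.
y = [0.3, -0.1, 0.4, 0.2, -0.5]
print(gaussian_loglik(y, 0.0, ar1_covariance(0.5, 1.0, len(y))))
```

An information criterion built on this quantity would use $-2\log L$ at the maximizing $\theta$ plus the chosen penalty. The dense Cholesky factorization costs $O(n^3)$, which is fine for n = 135 but is not how production code would do it.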