GARCH Model – How to Choose the Order of a GARCH Model?

Tags: garch, model-selection, r, time-series

In order to model time series with GARCH models in R, you first determine the AR order and the MA order using ACF and PACF plots. But then how do you determine the order of the actual GARCH model?

I.e., say you find that ARMA(0,1) fits your data; you would then fit:

library(fGarch)  # garchFit() comes from the fGarch package
garchFit(formula = ~ arma(0, 1) + garch(1, 1), data = XX, trace = FALSE, include.mean = FALSE)

I know GARCH(1,1) is the most widely used, but what's the best way to determine the order? AIC?

Best Answer

You should determine both the ARMA and the GARCH orders simultaneously.

If the process is indeed well approximated by an ARMA-GARCH model, considering the conditional mean model (ARMA) while neglecting the conditional variance model (GARCH) -- thereby (implicitly) assuming the conditional variance to be constant -- will lead to trouble. Similarly, when considering the conditional variance model you should not neglect the model for the conditional mean.

This is because neither the conditional mean model nor the conditional variance model can be estimated consistently if taken separately, except under special conditions (e.g. if the MA part of the ARMA model is empty, the AR part can be estimated consistently even if the GARCH model is neglected). However, joint estimation will typically be more efficient, which is why it is preferred.

Unfortunately, the task of jointly determining the ARMA-GARCH orders is difficult, as I understand it. You may experiment with a few different models and compare their AICs or BICs. The larger the pool of candidate models, the more likely you are to overfit. If you have a large enough sample, you may try cross validation. That is,
(1) define a pool of candidate models,
(2) estimate the models on part of the sample,
(3) use the estimated models to predict the remainder of the sample,
(4) pick the model that has the lowest prediction error.
Still, this will not prevent overfitting if the pool of candidate models is large compared with the sample size.
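
For concreteness, here is a minimal sketch (not from the original answer) of the information-criterion comparison in R with the fGarch package. The simulated series x and the small candidate grid are only placeholders for your own data and pool of models:

library(fGarch)

## Simulate a GARCH(1,1) series purely so the sketch is self-contained;
## replace x with your own (demeaned) return series.
set.seed(1)
n <- 1000; omega <- 1e-5; alpha <- 0.1; beta <- 0.85
h <- numeric(n); x <- numeric(n)
h[1] <- omega / (1 - alpha - beta); x[1] <- sqrt(h[1]) * rnorm(1)
for (t in 2:n) {
  h[t] <- omega + alpha * x[t - 1]^2 + beta * h[t - 1]
  x[t] <- sqrt(h[t]) * rnorm(1)
}

## Small pool of candidate ARMA(p,q)-GARCH(m,s) orders.
orders <- expand.grid(p = 0:1, q = 0:1, m = 1, s = 1:2)

## Fit each candidate jointly and collect its information criteria.
fits <- lapply(seq_len(nrow(orders)), function(i) {
  o <- orders[i, ]
  arma_part <- if (o$p == 0 && o$q == 0) "" else sprintf("arma(%d, %d) + ", o$p, o$q)
  f <- as.formula(paste0("~ ", arma_part, sprintf("garch(%d, %d)", o$m, o$s)))
  fit <- tryCatch(garchFit(formula = f, data = x, trace = FALSE,
                           include.mean = FALSE),
                  error = function(e) NULL)
  if (is.null(fit)) return(NULL)
  data.frame(o, AIC = as.numeric(fit@fit$ics["AIC"]),
                BIC = as.numeric(fit@fit$ics["BIC"]))
})
ics <- do.call(rbind, fits)
ics[order(ics$BIC), ]   # candidates ranked by BIC (lowest first)

The same loop structure would carry over to the cross-validation idea: estimate each candidate on the first part of the sample and score it on out-of-sample prediction error instead of AIC/BIC.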

You can also examine the properties of the residuals of the different models. You would prefer a model with well-behaved residuals (no remaining autocorrelation, no remaining ARCH patterns, etc.). Again, this is subject to overfitting: it is easy to slip into data dredging (model mining) if one keeps refitting until all of the model coefficients are significant. But can a result achieved via data dredging be trusted? (Of course not.)
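
As a rough sketch of those residual checks, assuming fit holds a fitted fGARCH object (e.g. after refitting the BIC-best specification from the grid above):

## Standardized residuals from the fitted ARMA-GARCH model.
z <- residuals(fit, standardize = TRUE)

Box.test(z,   lag = 10, type = "Ljung-Box")   # remaining autocorrelation?
Box.test(z^2, lag = 10, type = "Ljung-Box")   # remaining ARCH effects?

acf(z); acf(z^2)       # visual check of both
qqnorm(z); qqline(z)   # check the assumed conditional distribution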

I am sorry I do not have a better solution.
