Solved – Optimal lag order selection for a GARCH model

garch | lags | model-selection | time-series

My research is on forecasting petrol demand, and I want to fit a GARCH model. I have a sample of 260 weekly observations, and my data set contains only one variable.

  1. Is there a method to find the optimal lag order for a GARCH model?

Edit: I used the "fGarch" package in R to fit a GARCH(1,1) model. Here is the output:

> summary(fit)

Title:
 GARCH Modelling 

Call:
 garchFit(formula = ~garch(1, 1), data = OriData) 

Mean and Variance Equation:
 data ~ garch(1, 1)
<environment: 0x000000002202df90>
 [data = OriData]

Conditional Distribution:
 norm 

Coefficient(s):
        mu       omega      alpha1       beta1  
 477.60999  2827.32970     0.48594     0.42162  

Std. Errors:
 based on Hessian 

Error Analysis:
        Estimate  Std. Error  t value Pr(>|t|)    
mu      477.6100     10.4644   45.641   <2e-16 ***
omega  2827.3297   1455.8710    1.942   0.0521 .  
alpha1    0.4859      0.1805    2.692   0.0071 ** 
beta1     0.4216      0.1950    2.162   0.0306 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Log Likelihood:
 -1651.535    normalized:  -6.352057 

Description:
 Mon Oct 05 14:30:13 2015 by user: DELL 


Standardised Residuals Tests:
                                Statistic p-Value     
 Jarque-Bera Test   R    Chi^2  0.1384022 0.933139    
 Shapiro-Wilk Test  R    W      0.965667  7.004156e-06
 Ljung-Box Test     R    Q(10)  522.7621  0           
 Ljung-Box Test     R    Q(15)  586.3901  0           
 Ljung-Box Test     R    Q(20)  614.9063  0           
 Ljung-Box Test     R^2  Q(10)  3.697788  0.9599522   
 Ljung-Box Test     R^2  Q(15)  5.138439  0.990888    
 Ljung-Box Test     R^2  Q(20)  7.750981  0.9933912   
 LM Arch Test       R    TR^2   4.631041  0.9691821   

Information Criterion Statistics:
     AIC      BIC      SIC     HQIC 
12.73488 12.78966 12.73442 12.75690

> one=residuals(fit, standardize = FALSE)
> Box.test(one,lag=1)

        Box-Pierce test

data:  one
X-squared = 180.1844, df = 1, p-value < 2.2e-16

All coefficients are significant, and the $p$-values of the Jarque-Bera and ARCH-LM tests are greater than 0.05.

  1. Can I use this as a good model?
  2. How can I test normality of residuals?

Best Answer

A few methods that could be applied for GARCH order selection:

  1. Just use the good old GARCH(1,1). Hansen & Lunde "Does anything beat a GARCH(1,1)?" compared a large number of parametric volatility models in an extensive empirical study. They found that no other model provides significantly better forecasts than the GARCH(1,1) model.
    (However, Ghalanos argues for the opposite in his blog post "Does anything NOT beat the GARCH(1,1)?", illustrating the case with empirical examples. Also, Reschenhofer asks "Does Anyone Need a GARCH(1,1)?" and shows that simple robust estimators such as weighted medians of past (squared) returns outperform the GARCH(1,1) model both in-sample and out-of-sample. Note that the latter paper uses intraday data, which might not be available in practice.)
  2. Estimate all possible subset models of a GARCH($p$,$q$) model with $p$, $q$ somewhat large (but not too large -- so that the computations remain feasible) and choose the best according to an information criterion: use AIC if the model is intended for forecasting; use BIC if the model is intended for explanatory modelling. Also note that as the pool of candidate models grows, AIC and BIC tend to select models that overfit; see Hansen "A winner's curse for econometric models: on the joint distribution of in-sample fit and out-of-sample fit and its implications for model selection".
  3. Estimate a bunch of models as in 2. and look at the properties of their residuals. Test for no autocorrelation (perhaps with a Ljung-Box test) and for no remaining ARCH patterns (with a Li-Mak test), and maybe more. This will not be the best method for forecasting as you will likely choose a model that is not parsimonious enough; but it could be fine for explanatory modelling where it is the model bias rather than the estimation variance that is to be minimized, as per Shmueli "To Explain or to Predict" (p. 293).
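Method 2 can be sketched in R with the "fGarch" package that the question already uses. This is only a sketch: it assumes the series is in a numeric vector called `OriData` (as in the question's output) and restricts the search to small orders; `fit@fit$ics` holds the information criteria in fGarch.

```r
library(fGarch)

# Grid-search small GARCH(p, q) orders and keep the fit with the lowest AIC.
# Assumes the data is in a numeric vector `OriData`, as in the question.
best_aic <- Inf
best_fit <- NULL
for (p in 1:3) {        # ARCH order; must be at least 1 in fGarch
  for (q in 0:3) {      # GARCH order; q = 0 gives a pure ARCH(p) model
    f <- as.formula(paste0("~ garch(", p, ", ", q, ")"))
    fit <- tryCatch(garchFit(f, data = OriData, trace = FALSE),
                    error = function(e) NULL)
    if (is.null(fit)) next                 # skip orders that fail to converge
    aic <- fit@fit$ics["AIC"]              # also available: "BIC", "SIC", "HQIC"
    if (aic < best_aic) { best_aic <- aic; best_fit <- fit }
  }
}
summary(best_fit)
```

Swap `"AIC"` for `"BIC"` if the model is meant for explanatory modelling rather than forecasting, in line with point 2 above.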

There are a couple of related questions, here and here, which you may want to check out.

Edit: answering the added questions,

  1. Your model is not entirely "good" because there appears to be strong autocorrelation in the standardized model residuals -- see the Ljung-Box test results for the (non-squared) residuals. You could model the conditional mean of the series (e.g. with an ARIMA model) jointly with the conditional variance (e.g. with a GARCH model, just as you did) to remove that autocorrelation.
  2. Two normality tests are actually reported in the output you presented: the Jarque-Bera test suggests normality, while the Shapiro-Wilk test suggests non-normality, so you have conflicting results. Since the Jarque-Bera test is based only on skewness and kurtosis, it may miss other departures from normality that the Shapiro-Wilk test is picking up. In sum, I would be cautious and look more carefully at the residuals. Try a Q-Q (quantile-quantile) plot, a kernel density estimate and/or other means to see how far from normality the standardized residuals are.