I am trying to model and forecast the volatility of stock returns and I am not sure if what I am doing is correct. My issue is mostly with GARCH and its application. If I want to model the volatility of stock returns, would fitting a GARCH model to them be an appropriate method? I realize the model isn't going to be great, but this is mostly practice, if you will.
Furthermore, if I fit a GARCH model to stock returns, say a GARCH(2,1), would the regression on my (y)t look like:
NOTE: t and t-1 denote time periods; a, b and d are coefficients and c is a constant.
(y)t = c + a*(σ^2)t + (e)t
or
(y)t = c + (e)t
? If it's the latter, then how does GARCH enter into the regression?
While the (σ^2)t formula would be:
(σ^2)t = c + a*(e^2)t-1 + b*(e^2)t-2 + d*(σ^2)t-1
Assuming I am correct about fitting a GARCH model to the stock returns, then to model the volatility of the stock I could use the following command:
garchFit(formula = ~garch(2,1), data = StockReturns$DowJonesReturns)
And then I choose the lags based on the AIC, which can be obtained by using summary() on the code above. Is this valid, or is there a better method for choosing GARCH lags?
Title:
GARCH Modelling
Call:
garchFit(formula = ~garch(2, 1), data = StockReturns$DowJonesReturns)
Mean and Variance Equation:
data ~ garch(2, 1)
<environment: 0x1461eef8>
[data = StockReturns$DowJonesReturns]
Conditional Distribution:
norm
Coefficient(s):
        mu     omega    alpha1    alpha2     beta1
  0.066543  0.030719  0.062670  0.086457  0.824586
Std. Errors:
based on Hessian
Error Analysis:
        Estimate  Std. Error  t value  Pr(>|t|)
mu      0.066543    0.015325    4.342  1.41e-05 ***
omega   0.030719    0.005464    5.622  1.88e-08 ***
alpha1  0.062670    0.021139    2.965  0.003031 **
alpha2  0.086457    0.026001    3.325  0.000884 ***
beta1   0.824586    0.017826   46.257   < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Log Likelihood:
-3405.385 normalized: -1.339648
Description:
Sat May 13 20:16:02 2017 by user:
Standardised Residuals Tests:
                               Statistic  p-Value
Jarque-Bera Test   R    Chi^2  207.7602   0
Shapiro-Wilk Test  R    W      0.9849516  9.114147e-16
Ljung-Box Test     R    Q(10)  21.76658   0.01633891
Ljung-Box Test     R    Q(15)  30.04193   0.0117712
Ljung-Box Test     R    Q(20)  32.01694   0.04311811
Ljung-Box Test     R^2  Q(10)  6.949068   0.7302444
Ljung-Box Test     R^2  Q(15)  11.35256   0.7272252
Ljung-Box Test     R^2  Q(20)  17.23784   0.6374799
LM Arch Test       R    TR^2   6.966663   0.8598087
Information Criterion Statistics:
     AIC      BIC      SIC     HQIC
2.683230 2.694718 2.683222 2.687397
Finally, looking at the final output, I'm not completely sure what it is I've actually done, mainly due to my lack of understanding of how GARCH enters the equation in the first place. I would presume `alpha` and `beta` are coefficients for (e^2)t-1 and (σ^2)t-1, correct? And `mu` and `omega` are constants? Are these coefficients for y(t) or something else?
Best Answer
GARCH models are frequently used for modelling stock price volatility, so there is nothing wrong with trying to fit such a model. You can later examine how well it fits the data and whether its assumptions are satisfied to decide whether to keep the model or to look for an alternative.
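A GARCH(2,1) model looks like this:

$$
\begin{aligned}
r_t &= \mu_t + u_t, \\
u_t &= \sigma_t \varepsilon_t, \\
\sigma_t^2 &= \omega + \alpha_1 u_{t-1}^2 + \alpha_2 u_{t-2}^2 + \beta_1 \sigma_{t-1}^2, \\
\varepsilon_t &\sim \text{i.i.d.}(0,1),
\end{aligned}
$$

where $\mu_t$ is the conditional mean of $r_t$, which could be e.g. a constant or an ARMA process. With a constant mean this is the second form in your question, $y_t = c + e_t$: the conditional variance does not appear as a regressor in the mean equation but determines the scale of the error term $u_t$.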
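To make the recursion concrete, here is a minimal base-R sketch that simulates a GARCH(2,1) process with a constant mean. The parameter values are copied from the `garchFit` output in the question; the Gaussian innovations, the seed, the sample size and all variable names are assumptions chosen for illustration only.

```r
# Simulate r_t = mu + u_t, u_t = sigma_t * eps_t, with a GARCH(2,1) variance.
set.seed(1)                       # arbitrary seed (assumption)
n      <- 2000                    # arbitrary sample size (assumption)
mu     <- 0.066543                # estimates taken from the posted output
omega  <- 0.030719
alpha  <- c(0.062670, 0.086457)   # alpha1, alpha2
beta1  <- 0.824586

# Unconditional variance, used here to initialise the recursion
uncond_var <- omega / (1 - sum(alpha) - beta1)

eps    <- rnorm(n)                # i.i.d.(0,1) innovations, Gaussian by assumption
sigma2 <- rep(uncond_var, n)      # conditional variances
u      <- numeric(n)              # error terms
u[1:2] <- sqrt(sigma2[1:2]) * eps[1:2]

for (t in 3:n) {
  # Conditional variance equation
  sigma2[t] <- omega + alpha[1] * u[t - 1]^2 + alpha[2] * u[t - 2]^2 +
               beta1 * sigma2[t - 1]
  # Error term scaled by the conditional standard deviation
  u[t] <- sqrt(sigma2[t]) * eps[t]
}

r <- mu + u                       # mean equation: constant plus error
```

Note that `sigma2` never enters the line that builds `r` directly; it only controls how large `u[t]` is likely to be in each period.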
Model selection is a difficult task and there is no silver bullet that always works. Besides, you may want to select different models for different purposes. For example, if you are interested in forecasting, choosing a model that has the lowest AIC value among the candidate models is a sensible strategy. But you should also look at model diagnostics (mainly residual diagnostics: how close they are to an i.i.d. sequence and how close their empirical distribution is to the assumed distribution) to see how good it is in absolute terms (AIC selects the best model for forecasting, but it does not tell how good the best model is).
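As a rough sketch of AIC-based selection with the `fGarch` package used in the question, one can fit a few candidate orders and compare the information criteria stored in the fitted objects. The simulated series `x` is only a placeholder for the real returns, and the set of candidate orders is an arbitrary assumption.

```r
library(fGarch)  # assumed to be installed; the question already uses garchFit()

# Placeholder data: a simulated GARCH(1,1) series stands in for the real returns
set.seed(42)
n  <- 1500
x  <- numeric(n)
s2 <- rep(1, n)
x[1] <- rnorm(1)
for (t in 2:n) {
  s2[t] <- 0.05 + 0.10 * x[t - 1]^2 + 0.85 * s2[t - 1]
  x[t]  <- sqrt(s2[t]) * rnorm(1)
}

# Candidate variance specifications (an arbitrary small set)
candidates <- list("garch(1,1)" = ~ garch(1, 1),
                   "garch(2,1)" = ~ garch(2, 1),
                   "garch(1,2)" = ~ garch(1, 2))

fits <- lapply(candidates,
               function(f) garchFit(formula = f, data = x, trace = FALSE))

# Each fit stores AIC, BIC, SIC and HQIC in fit@fit$ics
ics <- t(sapply(fits, function(fit) fit@fit$ics))
print(ics)

# Name of the candidate with the lowest AIC
best <- rownames(ics)[which.min(ics[, "AIC"])]
best
```

The criteria are only comparable across models fitted to the same data, and the lowest-AIC model should still be checked with residual diagnostics.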
The notation I used above is mostly in line with the output of the GARCH model you have included in your post. So `omega`, `alpha1`, `alpha2` and `beta1` come from the conditional variance equation while `mu` comes from the conditional mean equation.

If the standardized residuals are not normally distributed (the Jarque-Bera and Shapiro-Wilk tests in your output reject normality), you cannot trust the standard errors (and thus the p-values and significance tests) of the estimated coefficients as reported. But if the true distribution is sufficiently well behaved, your estimator can be treated as a quasi maximum likelihood estimator (QMLE). Then you can use robust standard errors in place of vanilla standard errors (robust standard errors are reported as part of the default estimation output in the R package "rugarch"), and then the p-values and significance tests are fine.
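The robust standard errors mentioned above can be inspected with `rugarch` roughly as follows. This is a sketch under assumptions: the series `x` is simulated placeholder data with heavy-tailed innovations, the simulation parameters are made up, and the model is a GARCH(2,1) with a constant mean fitted under a normal likelihood.

```r
library(rugarch)  # assumed to be installed

# Placeholder data: simulated GARCH(2,1) series with scaled t(6) innovations
set.seed(7)
n   <- 2000
eps <- rt(n, df = 6) / sqrt(6 / 4)   # t(6) rescaled to unit variance
s2  <- rep(1, n)
u   <- numeric(n)
u[1:2] <- eps[1:2]
for (t in 3:n) {
  s2[t] <- 0.03 + 0.06 * u[t - 1]^2 + 0.09 * u[t - 2]^2 + 0.82 * s2[t - 1]
  u[t]  <- sqrt(s2[t]) * eps[t]
}
x <- 0.07 + u

# Two ARCH terms and one GARCH term, constant mean, normal likelihood
spec <- ugarchspec(
  variance.model     = list(model = "sGARCH", garchOrder = c(2, 1)),
  mean.model         = list(armaOrder = c(0, 0), include.mean = TRUE),
  distribution.model = "norm"
)
fit <- ugarchfit(spec = spec, data = x)

vanilla <- fit@fit$matcoef         # Hessian-based standard errors
robust  <- fit@fit$robust.matcoef  # robust (QMLE-type) standard errors
print(vanilla)
print(robust)
```

Both tables share the same point estimates; only the standard errors, t values and p-values differ.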
Yes, you could try that. Just remember that you are interested in the distribution of standardized residuals $\varepsilon_t$, not raw residuals $u_t$. (Sometimes $u_t$ will have heavy tails but $\varepsilon_t$ will not.)
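A small sketch of how raw and standardized residuals can be extracted and checked with `fGarch`. The data are again simulated placeholders, and the helper `kurt` is a hypothetical name defined here only for illustration.

```r
library(fGarch)  # assumed to be installed

# Placeholder data: simulated GARCH(1,1) series with Gaussian innovations
set.seed(123)
n  <- 1500
x  <- numeric(n)
s2 <- rep(1, n)
x[1] <- rnorm(1)
for (t in 2:n) {
  s2[t] <- 0.05 + 0.10 * x[t - 1]^2 + 0.85 * s2[t - 1]
  x[t]  <- sqrt(s2[t]) * rnorm(1)
}

fit <- garchFit(~ garch(1, 1), data = x, trace = FALSE)

u <- as.numeric(residuals(fit))                      # raw residuals u_t
z <- as.numeric(residuals(fit, standardize = TRUE))  # standardized residuals

# Sample kurtosis (hypothetical helper); a normal distribution has kurtosis 3
kurt <- function(v) mean((v - mean(v))^4) / mean((v - mean(v))^2)^2
print(c(raw = kurt(u), standardized = kurt(z)))

# Diagnostics belong on the standardized residuals
print(Box.test(z,   lag = 10, type = "Ljung-Box"))   # leftover autocorrelation
print(Box.test(z^2, lag = 10, type = "Ljung-Box"))   # leftover ARCH effects
print(shapiro.test(z))                               # normality of eps_t
```

The same checks on `u` would mix the effect of time-varying volatility with the shape of the innovation distribution, which is why `z` is the relevant series.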