If I want to model the volatility of stock returns, would fitting a GARCH model to them be an appropriate method?
GARCH models are frequently used for modelling stock price volatility, so there is nothing wrong with trying to fit such a model. You can later examine how well it fits the data and whether its assumptions are satisfied to decide whether to keep the model or to look for an alternative.
Say I have a GARCH(2,1); would the regression for my $y_t$ look like: <...>
A GARCH(2,1) model looks like this:
$$
\begin{aligned}
r_t &= \mu_t + u_t, \\
u_t &= \sigma_t \varepsilon_t, \\
\sigma_t^2 &= \omega + \alpha_1 u_{t-1}^2 + \alpha_2 u_{t-2}^2 + \beta_1 \sigma_{t-1}^2, \\
\varepsilon_t &\sim \text{i.i.d.}(0,1),
\end{aligned}
$$
where $\mu_t$ is the conditional mean of $r_t$, which could be, e.g., a constant or an ARMA process.
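To make the recursion concrete, here is a minimal simulation sketch of a GARCH(2,1) process in plain Python. The parameter values and the function name are illustrative assumptions, not estimates from any data:

```python
import math
import random

def simulate_garch21(n, omega=0.05, a1=0.10, a2=0.05, b1=0.80, mu=0.0, seed=1):
    """Simulate r_t = mu + u_t with a GARCH(2,1) conditional variance.

    Illustrative parameter values only; they satisfy a1 + a2 + b1 < 1
    so the unconditional variance omega / (1 - a1 - a2 - b1) exists.
    """
    rng = random.Random(seed)
    var0 = omega / (1 - a1 - a2 - b1)          # start at the unconditional variance
    sigma2 = [var0, var0]                       # two pre-sample variances
    u = [math.sqrt(var0) * rng.gauss(0, 1),     # two pre-sample shocks
         math.sqrt(var0) * rng.gauss(0, 1)]
    r = []
    for t in range(2, n + 2):
        # sigma_t^2 = omega + a1*u_{t-1}^2 + a2*u_{t-2}^2 + b1*sigma_{t-1}^2
        s2 = omega + a1 * u[t - 1] ** 2 + a2 * u[t - 2] ** 2 + b1 * sigma2[t - 1]
        eps = rng.gauss(0, 1)                   # i.i.d. N(0,1) innovation
        ut = math.sqrt(s2) * eps                # u_t = sigma_t * eps_t
        sigma2.append(s2)
        u.append(ut)
        r.append(mu + ut)                       # r_t = mu_t + u_t
    return r, sigma2[2:]

returns, variances = simulate_garch21(1000)
```

Note that the conditional variance needs two lagged squared shocks and one lagged variance to get started, which is why the sketch seeds the recursion with two pre-sample values.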
And then I choose the lags based on the AIC, which can be obtained by calling summary on the code above. Is this valid, or is there a better method for choosing GARCH lags?
Model selection is a difficult task and there is no silver bullet that always works. Besides, you may want to select different models for different purposes. For example, if you are interested in forecasting, choosing a model that has the lowest AIC value among the candidate models is a sensible strategy. But you should also look at model diagnostics (mainly residual diagnostics: how close they are to an i.i.d. sequence and how close their empirical distribution is to the assumed distribution) to see how good it is in absolute terms (AIC selects the best model for forecasting, but it does not tell how good the best model is).
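As a sketch of the comparison itself: $\text{AIC} = 2k - 2\log L$, where $k$ is the number of estimated parameters, and among the candidates you pick the model with the smallest value. The log-likelihoods below are made-up numbers purely for illustration:

```python
def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2*logL (smaller is better)."""
    return 2 * n_params - 2 * loglik

# Hypothetical maximized log-likelihoods from two candidate fits (illustrative only).
candidates = {
    "GARCH(1,1)": aic(loglik=-1402.3, n_params=4),  # mu, omega, alpha1, beta1
    "GARCH(2,1)": aic(loglik=-1401.9, n_params=5),  # adds alpha2
}
best = min(candidates, key=candidates.get)
```

Here the extra lag buys only a tiny likelihood improvement, so the penalty term makes the smaller model win; with real data the log-likelihoods come from the fitted models' output.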
I would presume $\alpha$ and $\beta$ are coefficients for $e_{t-1}^2$ and $\sigma_{t-1}^2$, correct? And $\mu$ and $\omega$ are constants? Are these coefficients for $y_t$ or something else?
The notation I used above is mostly in line with the output of the GARCH model you have included in your post. So omega, alpha1, alpha2 and beta come from the conditional variance equation, while mu comes from the conditional mean equation.
We assume the errors have a certain distribution, in this case a normal one. Failure of the tests would indicate that the assumption is not met. <...> What effect could this have on my model?
You cannot trust the standard errors (and thus the p-values and significance tests) of the estimated coefficients. But if the true distribution is sufficiently well behaved, your estimator can be treated as a quasi-maximum-likelihood estimator (QMLE). Then you can use robust standard errors in place of the vanilla ones (robust standard errors are reported as part of the default estimation output in the R package "rugarch"), and the resulting p-values and significance tests are fine.
A reasonable way to solve this would be to change the distribution. Since mine is fat-tailed with a high peak, a t distribution would be better, correct?
Yes, you could try that. Just remember that you are interested in the distribution of standardized residuals $\varepsilon_t$, not raw residuals $u_t$. (Sometimes $u_t$ will have heavy tails but $\varepsilon_t$ will not.)
Descriptive statistics such as the mean, standard deviation, skewness, and kurtosis are not as useful for prices as they are for returns. The reason is that the price data generating process is not stable: the price distribution varies from day to day. Hence, a "global" measure does not necessarily say anything useful about "the distribution" of prices. For example, if the price is trending upward, the average price will seriously underestimate the means of future price distributions.
Return distributions are not perfectly stable, but they are much more stable than prices, so their descriptive statistics are more relevant for future predictions.
On a related note, a gross misuse of statistics is to apply any standard statistical method (t-intervals, etc.) to prices, because the fundamental assumption that the observations are independent and identically distributed is grossly violated for prices. Generally, prices are highly autocorrelated (nearly a random walk in many cases, with autocorrelation ~1.0). On the other hand, autocorrelations in returns are usually small (~0.0); and if they are not, that is a violation of market efficiency.
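A quick illustration of this contrast with simulated data (a random-walk price standing in for a real series, which is an assumption of the sketch): the lag-1 autocorrelation of prices comes out close to 1, while that of the i.i.d. returns comes out close to 0.

```python
import random

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a sequence."""
    n = len(x)
    m = sum(x) / n
    num = sum((x[t] - m) * (x[t - 1] - m) for t in range(1, n))
    den = sum((v - m) ** 2 for v in x)
    return num / den

rng = random.Random(42)
returns = [rng.gauss(0, 0.01) for _ in range(5000)]  # i.i.d. "returns"
prices = [100.0]
for r in returns:
    prices.append(prices[-1] * (1 + r))              # price follows a random walk

rho_price = lag1_autocorr(prices)   # close to 1: prices are highly persistent
rho_ret = lag1_autocorr(returns)    # close to 0: returns are nearly uncorrelated
```

This is why a t-interval on prices is meaningless while the same computation on returns is at least defensible.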
Best Answer
When designing a GARCH model, we are making an assumption on standardized errors of stock returns, not the stock returns themselves. GARCH as a structure generates heavy-tailed outputs (even) from Normal inputs. Thus leptokurtic stock returns are compatible with Normal standardized errors. Nevertheless, a stylized fact from the stock markets is that even the standardized errors tend to be heavy-tailed, although less so than the stock returns.
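A small simulation sketch of this point (the GARCH(1,1) parameters are illustrative assumptions): with Normal $\varepsilon_t$, the shocks $u_t = \sigma_t \varepsilon_t$ show positive excess kurtosis, while the standardized errors themselves do not.

```python
import math
import random

def excess_kurtosis(x):
    """Sample excess kurtosis: m4 / var^2 - 3 (zero for a Normal)."""
    n = len(x)
    m = sum(x) / n
    var = sum((v - m) ** 2 for v in x) / n
    m4 = sum((v - m) ** 4 for v in x) / n
    return m4 / var ** 2 - 3.0

rng = random.Random(0)
omega, alpha, beta = 0.05, 0.15, 0.80   # illustrative GARCH(1,1) parameters
s2 = omega / (1 - alpha - beta)         # start at the unconditional variance
u, eps_list = [], []
for _ in range(20000):
    eps = rng.gauss(0, 1)               # Normal standardized error
    ut = math.sqrt(s2) * eps            # heavy-tailed GARCH shock
    u.append(ut)
    eps_list.append(eps)
    s2 = omega + alpha * ut ** 2 + beta * s2

k_u = excess_kurtosis(u)                # positive: volatility clustering fattens the tails
k_eps = excess_kurtosis(eps_list)       # near 0: the inputs are Normal
```

The mixing over time-varying $\sigma_t$ is what fattens the unconditional tails, so leptokurtic returns alone do not prove the standardized errors are non-Normal; you have to check the standardized residuals directly.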
If a normal distribution is assumed, the MLEs can (under some not-too-stringent conditions) be treated as QMLEs; the estimators are consistent but have higher variances than they would under the correct distribution. I guess the normal distribution is computationally convenient and is therefore often used as a quick-fix solution.