Suppose the conditional mean of returns is constant. A GARCH model gives you a fitted value of the conditional variance for each data point. These fitted values can be used to weight the data points to construct an efficient estimate of the mean (e.g. using weighted least squares); data points with high fitted conditional variance would be down-weighted relative to data points with low fitted conditional variance.
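As a minimal sketch of this idea (with simulated data and illustrative numbers standing in for the fitted conditional variances a GARCH model would produce):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate returns with a constant mean but time-varying variance.
# The sigma2 values play the role of fitted conditional variances
# from a GARCH model; "true_mean" is an illustrative value.
n = 5000
true_mean = 0.05
sigma2 = rng.uniform(0.5, 5.0, size=n)            # conditional variances
r = true_mean + rng.normal(0.0, np.sqrt(sigma2))  # returns

# Ordinary (unweighted) sample mean
ols_mean = r.mean()

# Weighted least squares: weight each observation by 1/sigma_t^2,
# down-weighting the high-variance points
w = 1.0 / sigma2
wls_mean = np.sum(w * r) / np.sum(w)

# The WLS estimator has a smaller standard error than the plain mean:
# var(WLS) = 1 / sum(w), while var(OLS) = mean(sigma2) / n
se_wls = np.sqrt(1.0 / np.sum(w))
se_ols = np.sqrt(sigma2.mean() / n)
print(ols_mean, wls_mean, se_wls < se_ols)
```

The weights are known here because the variances are simulated; in practice you would plug in the fitted conditional variances from the GARCH stage.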
Now suppose the conditional mean of returns is not constant. Then you would build a model for the conditional mean simultaneously with a GARCH model for the conditional variance. The effect of the GARCH model would again be similar to the case discussed above. The data points with high fitted conditional variance would be down-weighted relative to the points that have low fitted conditional variance when estimating the model for the conditional mean.
One example given by @CadgasOzgenc is an ARMA-GARCH model. A rich choice of ARIMAX specifications for the conditional mean combined with different versions of (G)ARCH models for the conditional variance can be implemented using the "rugarch" package in R (functions ugarchspec and ugarchfit).
Simultaneous estimation is efficient, but two-stage estimation could be done, too, if you can consistently estimate the conditional mean model in the presence of conditionally heteroskedastic errors. First you would estimate the conditional mean model ignoring that the errors have a GARCH structure. Second, you would estimate a GARCH model on the residuals from the conditional mean model. Then you would re-estimate the conditional mean model using the fitted conditional variances to weight the data points as discussed above. That could be done iteratively until convergence. For example, an AR-GARCH model could be estimated that way, since an AR(p) model can be estimated consistently even in the presence of GARCH errors. However, estimating an AR-GARCH model in one stage (simultaneously) would be more efficient.
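A toy sketch of one pass of this two-stage procedure, using only NumPy (all parameter values are illustrative, and the second stage uses a simple regression of squared residuals on their lag as a consistent stand-in for ARCH maximum likelihood):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate an AR(1) process with ARCH(1) errors (illustrative parameters)
n, phi, omega, alpha = 20000, 0.5, 0.2, 0.5
r = np.zeros(n)
e_prev = 0.0
for t in range(1, n):
    sigma2_t = omega + alpha * e_prev**2          # conditional variance
    e_t = np.sqrt(sigma2_t) * rng.standard_normal()
    r[t] = phi * r[t - 1] + e_t
    e_prev = e_t

y, x = r[1:], r[:-1]

# Stage 1: estimate the AR(1) coefficient by OLS, ignoring the ARCH
# structure (consistent despite the conditional heteroskedasticity)
phi_ols = np.sum(x * y) / np.sum(x * x)

# Stage 2: fit ARCH(1) to the residuals by regressing e_t^2 on e_{t-1}^2
# (a simple consistent estimator; joint MLE would be more efficient)
e = y - phi_ols * x
u, v = e[1:] ** 2, e[:-1] ** 2
V = np.column_stack([np.ones_like(v), v])
omega_hat, alpha_hat = np.linalg.lstsq(V, u, rcond=None)[0]

# Stage 3: re-estimate the AR(1) coefficient by WLS, weighting each
# point by the inverse of its fitted conditional variance
sigma2_hat = omega_hat + alpha_hat * np.concatenate([[e.var()], e[:-1] ** 2])
w = 1.0 / sigma2_hat
phi_wls = np.sum(w * x * y) / np.sum(w * x * x)
print(phi_ols, alpha_hat, phi_wls)
```

Stages 2 and 3 would be repeated until the estimates stop changing; the one-stage (simultaneous) maximum likelihood estimator remains the efficient benchmark.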
In short, you should select models using AIC and/or out-of-sample fit criteria and view the rejected hypothesis as a suggestion to consider other types of models.
When using this class of time series models, researchers are usually interested in accurate prediction/forecasting. Since AIC measures how well a model predicts the data in-sample, it serves as a fair means of model selection in this case (you may also want to test how well the models fit out-of-sample; more on that below).
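To make the criterion concrete, AIC is computed as 2k minus twice the maximized log-likelihood, where k is the number of estimated parameters. A small self-contained illustration with simulated data (the AR(1) coefficient and sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulate an AR(1) series (phi chosen for illustration)
n, phi = 2000, 0.5
r = np.zeros(n)
for t in range(1, n):
    r[t] = phi * r[t - 1] + rng.standard_normal()

def gaussian_aic(resid, k):
    """AIC = 2k - 2*logL for a Gaussian model with k estimated parameters."""
    m = resid.size
    s2 = resid.var()  # MLE of the residual variance
    loglik = -0.5 * m * (np.log(2 * np.pi * s2) + 1)
    return 2 * k - 2 * loglik

# Model 1: constant mean only (2 parameters: mean, variance)
aic_const = gaussian_aic(r[1:] - r[1:].mean(), k=2)

# Model 2: AR(1) fitted by OLS (3 parameters: intercept, slope, variance)
y, x = r[1:], r[:-1]
X = np.column_stack([np.ones_like(x), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
aic_ar1 = gaussian_aic(y - X @ beta, k=3)

print(aic_const, aic_ar1)  # the AR(1) model should attain the lower AIC
```

Here the AR(1) model's better in-sample fit easily outweighs its extra-parameter penalty, so AIC selects it over the constant-mean model.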
However, just because a particular model has the lowest AIC does not mean that that model is correctly specified or that it approximates the true data generating process well. It could be that all the models you proposed were poor choices, or that the true process FTSE follows is so complex that practically every reasonable model will be rejected given enough data. AIC provides no information on this point which is where hypothesis testing can come in.
Under the assumptions of a standard ARMA-GARCH model, the standardized residuals should be homoscedastic and, more strongly, iid normal. Your hypothesis test suggests that your residuals are not homoscedastic and, in turn, that your ARMA-GARCH model may be misspecified. On this note, you may want to consider alternative specifications for the volatility process, including other variants of GARCH models (e.g. EGARCH, GJR-GARCH, TGARCH, AVGARCH, NGARCH, GARCH-M) and/or stochastic volatility models. It is highly likely that one of these models will offer a lower AIC value and produce residuals for which homoscedasticity cannot be rejected.
One important thing to note, though, is that no model will be perfect, especially for something like the FTSE 100. The true data generating process driving a large financial index is extraordinarily complex, so practically every model you propose will be false in some respect. For this reason, it can be argued that any failure to reject a meaningful hypothesis reflects insufficient data or a lack of statistical power rather than evidence supporting one model over another.
One way to partially resolve this dilemma is to use out-of-sample fit instead of, or in conjunction with, AIC. A simple example would be to fit the model using only the first 80%-90% of the data and then use the resulting coefficient estimates to obtain a log-likelihood for the remaining 10%-20%. The model with the highest out-of-sample log-likelihood would be preferred. If the ARMA-GARCH model is truly misspecified in a way that impairs its forecasting performance, an out-of-sample comparison will help expose it.
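A bare-bones version of this train/hold-out comparison, again with simulated data and Gaussian likelihoods (the series, split fraction, and candidate models are all illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulate an AR(1) series and hold out the last 20% for evaluation
n, phi = 5000, 0.5
r = np.zeros(n)
for t in range(1, n):
    r[t] = phi * r[t - 1] + rng.standard_normal()
split = int(0.8 * n)

def oos_loglik(resid_in, resid_out):
    """Gaussian log-likelihood of held-out residuals using the
    in-sample residual variance estimate."""
    s2 = resid_in.var()
    return -0.5 * np.sum(np.log(2 * np.pi * s2) + resid_out**2 / s2)

# Model 1: constant mean, estimated on the training sample
mu = r[:split].mean()
ll_const = oos_loglik(r[:split] - mu, r[split:] - mu)

# Model 2: AR(1) fitted by OLS on the training sample
y, x = r[1:split], r[:split - 1]
X = np.column_stack([np.ones_like(x), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
yo, xo = r[split:], r[split - 1:-1]
Xo = np.column_stack([np.ones_like(xo), xo])
ll_ar1 = oos_loglik(y - X @ beta, yo - Xo @ beta)

print(ll_const, ll_ar1)  # the AR(1) model should score higher out of sample
```

Because the hold-out data never influence the coefficient estimates, a model that merely overfits in-sample gains nothing here, which is exactly the property that complements AIC.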
Best Answer
If you model the log returns directly, you're essentially assuming that there is no conditional variation in the mean. In some circumstances you may want to explicitly model both the conditional mean and the conditional variance, but at other times it may be sufficient to assume a constant mean and focus on the conditional variance. It depends on what you're trying to do.
In addition, if you fit a GARCH model to raw log returns, you're also implicitly assuming the mean is zero. Centering the data may be important if the mean is far from zero (e.g. in lower-frequency data).