Why fit ARMA before GARCH if I am interested in the variance of the data, not the residuals?

arima, garch, time series

I have been working on a time series that shows heteroskedasticity after taking the first difference. I found that ARCH/GARCH models are typically used to handle this.

The procedures I have read about say that the time series is first fitted with a conditional mean model such as AR or ARMA, and an ARCH/GARCH model is then applied to the residuals of the fitted AR/ARMA model.

My questions are:

  1. Why do we have to fit AR/ARMA?

  2. Why do we have to apply ARCH/GARCH to the residuals? Is that done to model the volatility in the residuals or the volatility in the actual data (differenced data)?

  3. If it is used to model the volatility of the residuals, how is that going to help in modeling the volatility in the actual data?

Best Answer

On the one hand, GARCH is a model for the conditional distribution of the time series $y_t$:

$$
\begin{aligned}
y_t &\sim d(\mu_t,\sigma_t^2), \\
\mu_t &= \dots \text{(e.g. some ARMA equation)} \\
\sigma_t^2 &= \omega + \alpha_1 ( y_{t-1} - \mu_{t-1} )^2 + \dotsc + \alpha_s ( y_{t-s} - \mu_{t-s} )^2 + \beta_1 \sigma_{t-1}^2 + \dotsc + \beta_r \sigma_{t-r}^2, \\
\varepsilon_t &:= \frac{y_t-\mu_t}{\sigma_t} \sim \text{i.i.d.}(0,1).
\end{aligned}
$$

GARCH specifically characterizes the conditional variance equation, but this equation inevitably depends on some equation for the conditional mean: the variance is defined in terms of squared deviations $y_t - \mu_t$ from the conditional mean, so without $\mu_t$ the conditional variance would be undefined. This hopefully answers your questions 1. and 3.
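To make the dependence on the conditional mean concrete, here is a minimal sketch using only the Python standard library. It assumes an AR(1) conditional mean, a GARCH(1,1) variance, and Gaussian innovations; the function names `simulate_ar1_garch11` and `garch11_filter` and all parameter values are hypothetical choices for illustration, not part of the original answer. The sketch simulates data from the model and then recovers $\sigma_t^2$ twice: once using the correct $\mu_t$, and once pretending the mean is zero.

```python
import random


def simulate_ar1_garch11(n, c, phi, omega, alpha, beta, seed=0):
    """Simulate y_t = mu_t + sigma_t * eps_t with mu_t = c + phi * y_{t-1}
    and sigma_t^2 = omega + alpha * u_{t-1}^2 + beta * sigma_{t-1}^2."""
    rng = random.Random(seed)
    y, mu, sigma2 = [], [], []
    y_prev = c / (1.0 - phi)                 # start at the unconditional mean of y
    u_prev = 0.0                             # assumed starting value for u
    s2_prev = omega / (1.0 - alpha - beta)   # unconditional variance of u
    for _ in range(n):
        m = c + phi * y_prev                              # conditional mean
        s2 = omega + alpha * u_prev ** 2 + beta * s2_prev  # conditional variance
        u = (s2 ** 0.5) * rng.gauss(0.0, 1.0)             # u_t = sigma_t * eps_t
        y_t = m + u
        y.append(y_t)
        mu.append(m)
        sigma2.append(s2)
        y_prev, u_prev, s2_prev = y_t, u, s2
    return y, mu, sigma2


def garch11_filter(y, mu, omega, alpha, beta):
    """Run the GARCH(1,1) recursion on the deviations y_t - mu_t."""
    u_prev = 0.0
    s2_prev = omega / (1.0 - alpha - beta)
    out = []
    for y_t, m_t in zip(y, mu):
        s2 = omega + alpha * u_prev ** 2 + beta * s2_prev
        out.append(s2)
        u_prev = y_t - m_t   # the deviation needs a conditional mean
        s2_prev = s2
    return out


# Illustrative parameter values (assumptions, not estimates from any data).
params = dict(omega=0.1, alpha=0.1, beta=0.8)
y, mu, sigma2_true = simulate_ar1_garch11(500, c=1.0, phi=0.5, **params)

sigma2_ok = garch11_filter(y, mu, **params)                 # correct mean
sigma2_wrong = garch11_filter(y, [0.0] * len(y), **params)  # mean ignored

max_err_ok = max(abs(a - b) for a, b in zip(sigma2_ok, sigma2_true))
avg_true = sum(sigma2_true) / len(sigma2_true)
avg_wrong = sum(sigma2_wrong) / len(sigma2_wrong)
print("max error with correct mean:", max_err_ok)
print("average sigma^2 (true):", avg_true)
print("average sigma^2 (mean ignored):", avg_wrong)
```

With the correct $\mu_t$ the recursion reproduces the simulated variance path up to floating-point rounding. With the mean ignored, the squared level of the series leaks into the "variance", which is why a mean equation has to be specified first, even if only a constant.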

On the other hand, GARCH happens to characterize the conditional variance of the residuals from the conditional mean model, $u_t$, as well. The exact same model as above can be represented as follows:

$$
\begin{aligned}
y_t &= \mu_t + u_t, \\
\mu_t &= \dots \text{(e.g. some ARMA equation)} \\
u_t &= \sigma_t \varepsilon_t, \\
\sigma_t^2 &= \omega + \alpha_1 u_{t-1}^2 + \dotsc + \alpha_s u_{t-s}^2 + \beta_1 \sigma_{t-1}^2 + \dotsc + \beta_r \sigma_{t-r}^2, \\
\varepsilon_t &\sim \text{i.i.d.}(0,1),
\end{aligned}
$$

and $\sigma_t^2$ is the conditional variance of the residual $u_t$, since $\text{Var}(u_t|I_{t-1})=\sigma_t^2$, where $I_{t-1}$ is the information available at time $t-1$. This hopefully answers your question 2.
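
A short derivation may help to see why the conditional variance of the residuals is also the conditional variance of the data itself. It assumes only that $\mu_t$ and $\sigma_t$ are determined by $I_{t-1}$ (as they are when they follow ARMA and GARCH equations) and that $\varepsilon_t$ is independent of $I_{t-1}$ with mean 0 and variance 1:

$$
\begin{aligned}
\text{Var}(y_t|I_{t-1}) &= \text{Var}(\mu_t + u_t|I_{t-1}) \\
&= \text{Var}(u_t|I_{t-1}) \qquad \text{(since $\mu_t$ is known given $I_{t-1}$)} \\
&= \text{Var}(\sigma_t \varepsilon_t|I_{t-1}) \\
&= \sigma_t^2 \, \text{Var}(\varepsilon_t) \\
&= \sigma_t^2.
\end{aligned}
$$

So under these assumptions, modeling the volatility of $u_t$ and modeling the volatility of $y_t$ are the same thing: subtracting the conditional mean shifts the conditional distribution without changing its spread.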