As I understand it, the inclusion of the GARCH term, $\sigma^2$, in a GARCH model allows an infinite number of past squared error terms, $\epsilon^2$, to influence the conditional variance. Is this the case? How does this characteristic make the GARCH model more parsimonious than the ARCH model?
Solved – GARCH vs ARCH models – which is more parsimonious
Related Solutions
Strict stationarity is the strongest form of stationarity. It means that the joint statistical distribution of any collection of the time series variates never depends on time, so the mean, variance and every higher moment are the same whichever variate you choose. For day-to-day use, however, strict stationarity is too strict, and a weaker definition is often used instead: stationarity of order 2 (second-order stationarity), which requires a constant mean, a constant variance and an autocovariance that depends only on the lag, not on time. A still weaker form is first-order stationarity, which requires only that the mean be a constant function of time; a series with a time-varying mean can sometimes be transformed (e.g. detrended) to obtain one that is first-order stationary.
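As an informal illustration of second-order stationarity (a rough sketch with made-up parameters, not a formal test): simulate a stationary AR(1) process and check that the mean and variance look the same in both halves of the sample.

```python
import numpy as np

# Sketch: simulate a stationary AR(1) process x_t = phi * x_{t-1} + eps_t
# and compare mean and variance across the two halves of the sample --
# an informal check of second-order stationarity (parameters illustrative).
rng = np.random.default_rng(0)
n, phi = 10_000, 0.5
eps = rng.standard_normal(n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + eps[t]

first, second = x[: n // 2], x[n // 2:]
print(first.mean(), second.mean())  # both near 0
print(first.var(), second.var())    # both near 1 / (1 - phi**2) = 1.33
```

For a random walk (a unit-root process), the same comparison would show the variance growing with time instead of staying constant.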
Traditional stationarity tests such as PP.test (the Phillips-Perron unit root test), the KPSS test or the augmented Dickey-Fuller test are not adequate if you perform regression via methods other than ARIMA (because in ARIMA the orders are fixed and no other factors producing non-stationarity enter the model). For non-ARIMA cases, stationarity tests in the frequency domain are more appropriate.
Tests in the frequency domain: the Priestley-Subba Rao (PSR) test for nonstationarity (fractal package). It is based on examining how homogeneous a set of spectral density function (SDF) estimates is across time, across frequency, or both.
The test you refer to is also a frequency-domain test (a wavelet-based test of second-order stationarity). It examines a quantity $\beta_j(t)$ that is closely related to a wavelet-based time-varying spectrum of the time series (it is a linear transform of the evolutionary wavelet spectrum of the locally stationary wavelet processes of Nason, von Sachs and Kroisandt, 2000). We check whether $\beta_j(t)$ varies over time or is constant by looking at the Haar wavelet coefficients of its estimate: the series is deemed stationary if all Haar coefficients are zero (locits package).
There are other concerns related to stationarity, such as long-range dependence and fractionally integrated processes (ARFIMA), where the differencing parameter $d$ captures long-memory behaviour.
The effect of higher-order non-stationarity and long-term dependence is that they are systematically reflected in the errors of a regression; however, their impact, and thus the validity of the regression, is difficult to measure.
Why bother with GARCH(1,0)? The $q$ (ARCH) terms are easier to estimate than the $p$ (GARCH) terms anyway (e.g. you can estimate an ARCH($q$) model with OLS).
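The OLS idea is simply that an ARCH($q$) model is an AR($q$) in the squared errors, so you can regress $\epsilon_t^2$ on a constant and its own lags. A minimal numpy sketch (the `arch_ols` helper and the simulation parameters are illustrative, not from any particular package; MLE remains the standard estimator):

```python
import numpy as np

def arch_ols(eps, q):
    """Estimate ARCH(q) by OLS: regress eps_t^2 on a constant and q lags of eps^2.

    Returns [omega, alpha_1, ..., alpha_q]. Illustrative helper only.
    """
    e2 = eps ** 2
    X = np.column_stack(
        [np.ones(len(e2) - q)] + [e2[q - i : len(e2) - i] for i in range(1, q + 1)]
    )
    y = e2[q:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Simulate an ARCH(1) process with omega = 0.2, alpha = 0.5.
rng = np.random.default_rng(1)
n, omega, alpha = 20_000, 0.2, 0.5
eps = np.zeros(n)
sig2 = omega / (1 - alpha)  # start at the unconditional variance
for t in range(n):
    eps[t] = np.sqrt(sig2) * rng.standard_normal()
    sig2 = omega + alpha * eps[t] ** 2

print(arch_ols(eps, 1))  # roughly [0.2, 0.5]
```

The estimates are consistent but less efficient than MLE, which is why OLS output is often used only as starting values.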
Nevertheless, my understanding of how MLE GARCH routines work is that they set the initial variance equal to either the sample variance or the expected (unconditional) value that you derive for this case. Without any ARCH terms, the sample-variance version would converge to the long-run value (at a speed depending on the size of $\delta$), while the expected-variance version would not change at all. So I am not sure you could say it is homoskedastic no matter what (it depends on how the initial variance is chosen), but it would likely converge quickly to the expected value for common values of $\delta$.
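This convergence is easy to see directly: with no ARCH terms, the variance recursion is deterministic and contracts geometrically toward its fixed point. A tiny sketch (parameter names and values are illustrative):

```python
import numpy as np

# With no ARCH terms the recursion is sigma2_t = omega + delta * sigma2_{t-1},
# which converges geometrically to omega / (1 - delta) from ANY starting value.
omega, delta = 0.1, 0.8
long_run = omega / (1 - delta)   # = 0.5

sigma2 = 2.0                     # deliberately "wrong" initial variance
path = []
for _ in range(50):
    sigma2 = omega + delta * sigma2
    path.append(sigma2)

print(path[0], path[-1], long_run)  # path[-1] is essentially 0.5
```

The gap to the long-run value shrinks by a factor of $\delta$ each step, so for common values of $\delta$ the initial choice is forgotten within a few dozen observations.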
Best Answer
Yes, it is. A GARCH model can be expressed (under some regularity condition) as an infinite-order ARCH model, thus making the conditional variance $\sigma_t^2$ depend on an infinite number of past squared innovations $\varepsilon_{t-i}^2$ for $i=1,2,\dots$.
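Spelled out for a GARCH(1,1) with $0 < \beta < 1$, substituting the recursion into itself repeatedly gives

$$
\sigma_t^2 = \omega + \alpha \varepsilon_{t-1}^2 + \beta \sigma_{t-1}^2
= \omega(1+\beta) + \alpha \varepsilon_{t-1}^2 + \alpha\beta \varepsilon_{t-2}^2 + \beta^2 \sigma_{t-2}^2
= \dots
= \frac{\omega}{1-\beta} + \alpha \sum_{i=1}^{\infty} \beta^{\,i-1} \varepsilon_{t-i}^2 ,
$$

so the single parameter $\beta$ generates the entire infinite lag structure of an ARCH($\infty$) representation with geometrically decaying weights $\alpha \beta^{\,i-1}$.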
Citing the original paper introducing the GARCH model (Bollerslev, 1986),
(emphasis is mine).
Under long memory, the influence of past innovations dies out gradually and slowly, and any finite-order ARCH model fails to capture this efficiently, whereas the GARCH model excels at it. How? See the answer to your first question above. GARCH is thus more parsimonious: it uses just a couple of (or a few) parameters to achieve what an ARCH model would need an infinite number of parameters for. The argument is essentially the same as the one for why an ARMA model is more parsimonious than a pure AR or MA model.
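The parsimony is easy to see numerically: two dynamics parameters of a GARCH(1,1) imply a whole geometrically decaying profile of ARCH weights, which a finite ARCH($q$) could only mimic with one free parameter per lag. A short sketch (parameter values illustrative):

```python
import numpy as np

# Implied ARCH(infinity) weights of a GARCH(1,1): weight on eps_{t-i}^2 is
# alpha * beta**(i-1). Two parameters generate the full decaying lag profile.
alpha, beta = 0.1, 0.85
weights = alpha * beta ** np.arange(20)  # lags 1, 2, ..., 20

print(np.round(weights[:5], 4))
# An ARCH(q) needs q separate alpha_i parameters to approximate this profile,
# and truncates the tail entirely beyond lag q.
```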
References:

Bollerslev, T. (1986). "Generalized Autoregressive Conditional Heteroskedasticity." *Journal of Econometrics*, 31(3), 307-327.

Nason, G. P., von Sachs, R. and Kroisandt, G. (2000). "Wavelet processes and adaptive estimation of the evolutionary wavelet spectrum." *Journal of the Royal Statistical Society, Series B*, 62(2), 271-292.