Solved – Intuition behind the characteristic equation of an AR or MA process

autoregressive, moving average, polynomial, time series

Ok, so I've just started learning Time Series Analysis.

We can write an $MA(q)$ process as $Y_t = \theta(L) \varepsilon_t$
and an $AR(p)$ process as $\varepsilon_t = \phi(L) Y_t$
in terms of the lag operator.

Then, with no explanation (from my textbook), we suddenly replace $L$ with $z$ to get $\theta(z)$ or $\phi(z)$, which can "be thought of as a generating function for the coefficients" – which I don't really understand.

We then set the characteristic polynomial equal to zero, and solve for the roots. And by some magic, the complex roots of this "characteristic equation" tell us whether or not the process is stationary?

I've searched around, but I can't find an explanation or a derivation for this result. Do I need to brush up on differential equations in order to understand this on an intuitive level?

Any help would be much appreciated!

Thanks!

Best Answer

When trying to get an intuitive understanding of formal mathematical models, it is usually best to start with a simple model and generalise later. So, with that in mind, let's start with an AR$(1)$ model with zero mean driven by a white-noise series $\varepsilon_t \sim \text{IID N}(0,1)$. This model can be written in scalar form as:

$$Y_t = \phi Y_{t-1} + \sigma \varepsilon_t.$$

Now, you can substitute in this auto-regression to get $Y_t$ in terms of earlier and earlier terms:

$$\begin{equation} \begin{aligned} Y_t &= \phi Y_{t-1} + \sigma \varepsilon_t \\[6pt] &= \phi (\phi Y_{t-2} + \sigma \varepsilon_{t-1}) + \sigma \varepsilon_t \\[6pt] &= \phi^2 Y_{t-2} + \sigma (\varepsilon_t + \phi \varepsilon_{t-1}) \\[6pt] &= \phi^2 (\phi Y_{t-3} + \sigma \varepsilon_{t-2}) + \sigma (\varepsilon_t + \phi \varepsilon_{t-1}) \\[6pt] &= \phi^3 Y_{t-3} + \sigma (\varepsilon_t + \phi \varepsilon_{t-1} + \phi^2 \varepsilon_{t-2}) \\[6pt] &= \phi^3 (\phi Y_{t-4} + \sigma \varepsilon_{t-3}) + \sigma (\varepsilon_t + \phi \varepsilon_{t-1} + \phi^2 \varepsilon_{t-2}) \\[6pt] &= \phi^4 Y_{t-4} + \sigma (\varepsilon_t + \phi \varepsilon_{t-1} + \phi^2 \varepsilon_{t-2} + \phi^3 \varepsilon_{t-3}) \\[6pt] &= \cdots \\[6pt] &= \phi^k Y_{t-k} + \sigma \sum_{i=0}^{k-1} \phi^i \varepsilon_{t-i}. \end{aligned} \end{equation}$$
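To make the back-substitution concrete, here is a quick numerical check of the final identity $Y_t = \phi^k Y_{t-k} + \sigma \sum_{i=0}^{k-1} \phi^i \varepsilon_{t-i}$ (a Python/NumPy sketch; the values $\phi = 0.6$, $\sigma = 1$ and the seed are illustrative choices, not from the answer):

```python
import numpy as np

# Simulate an AR(1) process Y_t = phi * Y_{t-1} + sigma * eps_t.
# phi, sigma, T and the seed are assumed illustrative values.
rng = np.random.default_rng(0)
phi, sigma, T = 0.6, 1.0, 200
eps = rng.standard_normal(T)

Y = np.zeros(T)
for t in range(1, T):
    Y[t] = phi * Y[t - 1] + sigma * eps[t]

# Check: Y_t = phi^k Y_{t-k} + sigma * sum_{i=0}^{k-1} phi^i eps_{t-i}
t, k = 150, 10
rhs = phi**k * Y[t - k] + sigma * sum(phi**i * eps[t - i] for i in range(k))
print(abs(Y[t] - rhs))  # zero up to floating-point error
```

The identity holds exactly for any $k$; stationarity only enters when we let $k \rightarrow \infty$.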

If $|\phi| < 1$ then the first term vanishes as $k \rightarrow \infty$, leaving the MA$(\infty)$ representation:

$$Y_t = \sigma \sum_{i=0}^\infty \phi^i \varepsilon_{t-i}.$$

This shows that if $|\phi| < 1$ then you can write an AR$(1)$ process as an MA$(\infty)$ process. The power series in this expression acts as a generating function for the coefficients $\phi^i$, and this representation allows you to find the distribution of the observable series of values.
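One thing the MA$(\infty)$ form gives you immediately is the stationary variance: $\text{Var}(Y_t) = \sigma^2 \sum_{i=0}^\infty \phi^{2i} = \sigma^2/(1-\phi^2)$. A quick simulation check (again with assumed illustrative values $\phi = 0.6$, $\sigma = 1$):

```python
import numpy as np

# Sketch: verify Var(Y_t) = sigma^2 / (1 - phi^2) by simulation.
# phi, sigma, T and the seed are assumed values for illustration.
rng = np.random.default_rng(1)
phi, sigma, T = 0.6, 1.0, 200_000
eps = rng.standard_normal(T)

Y = np.zeros(T)
for t in range(1, T):
    Y[t] = phi * Y[t - 1] + sigma * eps[t]

sample_var = Y[1000:].var()            # drop a burn-in so the start-up value washes out
theory_var = sigma**2 / (1 - phi**2)   # = 1.5625 for these values
print(sample_var, theory_var)
```

With $|\phi| \geq 1$ no such limit exists, which is one concrete way to see why the condition matters.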


Using the characteristic polynomial: Rather than doing all this in scalar form, the model can be written using the lag-operator $L$ as:

$$\phi(L) Y_t = \sigma \varepsilon_t,$$

where $\phi(L) = 1 - \phi L$ is the auto-regressive characteristic polynomial (which in this case is an affine function). Now, it turns out that this polynomial function can be inverted in the same way as a polynomial function involving a real or complex number (as opposed to the lag operator). That is, if $|\phi| < 1$ then the polynomial follows the inversion rule for an infinite geometric sum:

$$\phi^{-1}(L) = \frac{1}{1-\phi L} = \sum_{i=0}^\infty \phi^i L^i.$$
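You can verify this inversion numerically: multiplying $1-\phi L$ by the truncated geometric series leaves $1$ plus only an $O(\phi^n)$ remainder (Python sketch with an assumed $\phi = 0.6$):

```python
import numpy as np

# The coefficients of 1/(1 - phi*z) are the geometric series phi^i.
# Check: (1 - phi*z) * (sum_{i<n} phi^i z^i) = 1 - phi^n z^n.
# phi and n are assumed illustrative values.
phi, n = 0.6, 8
geom = np.array([phi**i for i in range(n)])
prod = np.convolve([1.0, -phi], geom)  # polynomial multiplication
print(prod)  # leading 1, zeros in the middle, trailing -phi^n
```

As $n$ grows the remainder $-\phi^n z^n$ shrinks geometrically (when $|\phi| < 1$), which is exactly why the infinite sum inverts the polynomial.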

Applying this to the process you get the MA$(\infty)$ representation we derived in scalar form before:

$$Y_t = \sigma \phi^{-1}(L) \varepsilon_t = \sigma \sum_{i=0}^\infty \phi^i L^i \varepsilon_t = \sigma \sum_{i=0}^\infty \phi^i \varepsilon_{t-i}.$$

You can see from the above that it is possible to handle time-series models with scalar methods alone, without using the lag operator at all. Introducing the lag operator, and polynomial functions of it, makes certain calculations (like the inversion above) much simpler. To verify that these manipulations are allowable, mathematicians appeal to the theory of functions and operators to establish that polynomials in the lag operator behave like polynomials in real or complex numbers. Once that is established, polynomials in the lag operator can be used to simplify changes of form in time-series models.
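To connect back to the question about roots: for a general AR$(p)$ model the same inversion works precisely when every root of $\phi(z) = 0$ lies outside the unit circle, because each factor of the polynomial then has a convergent geometric expansion (in the AR$(1)$ case the single root is $1/\phi$, and $|1/\phi| > 1$ is just $|\phi| < 1$ again). A quick check for a hypothetical AR$(2)$ model with coefficients $0.5$ and $0.3$, chosen only for illustration:

```python
import numpy as np

# Hypothetical AR(2): Y_t = 0.5 Y_{t-1} + 0.3 Y_{t-2} + eps_t.
# Characteristic polynomial: phi(z) = 1 - 0.5 z - 0.3 z^2.
# Stationary iff all roots of phi(z) = 0 lie OUTSIDE the unit circle.
coeffs = [-0.3, -0.5, 1.0]  # np.roots expects highest-degree coefficient first
roots = np.roots(coeffs)
print(roots, np.all(np.abs(roots) > 1))  # True => stationary
```

This is the "magic" the question mentions: the roots locate where the geometric expansions converge, and hence whether the AR polynomial can be inverted into a well-defined MA$(\infty)$ representation.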