Lag Operator and ARMA as an MA$(\infty)$ Process: A Detailed Study

Tags: arima, lags, time-series

Let's take an AR(1) model. I am comfortable with the fact that $Ly_t$ means the lag operator operates on the process $\{y_t\}$ by lagging it one period. What I am less comfortable with is writing a lag operator by itself, without a process to operate on. In the denominator below we have $1-a_1L$. It isn't immediately clear which process the lag operator is operating on, though on reflection I suppose we can multiply both sides by the denominator and conclude that the lag operator is acting on $y_t$?

$$y_t=a_0+a_1Ly_t+\epsilon_t$$

$$(1-a_1L)y_t=a_0+\epsilon_t$$

$$y_t=\frac{a_0+\epsilon_t}{1-a_1L}$$
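To make the formal step concrete for myself, I tried treating $L$ as an ordinary algebraic symbol and expanding the denominator with sympy (a sketch; treating $L$ this way is exactly the step I am unsure about):

```python
# A sketch: expand 1/(1 - a1*L) as a power series in L, treating the
# lag operator L as an ordinary algebraic symbol.
import sympy as sp

a1, L = sp.symbols("a1 L")
print(sp.series(1 / (1 - a1 * L), L, 0, 5))
# -> 1 + a1*L + a1**2*L**2 + a1**3*L**3 + a1**4*L**4 + O(L**5)
#    (up to how sympy orders the terms)
```

If each power $L^k$ is then applied to $a_0+\epsilon_t$ (a constant lagged is itself, so $L^k a_0 = a_0$), this would read $y_t = a_0\sum_{k}a_1^k + \sum_{k}a_1^k\epsilon_{t-k}$, but I don't see why the manipulation is legitimate.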

Now let's switch to the ARMA(p,q) model.

$$y_t=a_0+\sum_{i=1}^{p}a_iy_{t-i}+\sum_{i=0}^{q}\beta_i\epsilon_{t-i}$$
$$y_t-\sum_{i=1}^{p}a_iy_{t-i}=a_0+\sum_{i=0}^{q}\beta_i\epsilon_{t-i}$$
$$(1-\sum_{i=1}^{p}a_iL^{i})y_t=a_0+\sum_{i=0}^{q}\beta_i\epsilon_{t-i}$$

Because I am struggling to wrap my mind around the lag operator, I cannot quite convince myself of what Enders says (page 51): "The important point to recognize is that the expansion (of the equation below) will yield an $MA(\infty)$ process". I am looking for someone to help me see that expanding the $y_t$ expression below yields an $MA(\infty)$ process, and to help me understand what it means for a lag operator to be written by itself, without something to operate on. What is in the denominator of $y_t$? How do I think about it?

$$y_t=\frac{a_0+\sum_{i=0}^{q}\beta_i\epsilon_{t-i}}{1-\sum_{i=1}^{p}a_iL^{i}}$$
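For what it's worth, sympy will also expand the full ratio mechanically. Here is a sketch with made-up coefficients $a_1=\frac{1}{2}$, $a_2=\frac{3}{10}$, $\beta_0=1$, $\beta_1=\frac{2}{5}$ (and $a_0=0$ for simplicity); the series in $L$ never terminates, which I take to be the $MA(\infty)$ Enders means, but I still don't see what licenses it:

```python
# A sketch: expand a concrete ARMA(2,1) ratio as a power series in L.
import sympy as sp

L = sp.symbols("L")
ratio = (1 + sp.Rational(2, 5) * L) / (
    1 - sp.Rational(1, 2) * L - sp.Rational(3, 10) * L**2
)
print(sp.series(ratio, L, 0, 5))
# -> 1 + 9*L/10 + 3*L**2/4 + 129*L**3/200 + 219*L**4/400 + O(L**5)
```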

Best Answer

Hi ColorStatistics: This isn't a mathematical proof, but you can use the lag operator as if it were a number. (This is proven in functional analysis, and there's a proof of it somewhere on the net that I'll look for after I write this.)

So, suppose you had $\frac{1}{1 - \rho}$ with $|\rho| \lt 1$.

Then, using the formula for an infinite geometric series, the expression can be rewritten as

$1 + \rho + \rho^2 + \rho^3 + \cdots$, which converges since $|\rho| \lt 1$.

The proof of the above is more straightforward than the proof that one can use $L$ as a number, and should be in any calculus or pre-calculus text.
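A quick numerical check of that fact (a sketch; $\rho = 0.7$ is an arbitrary choice):

```python
# Partial sums of the geometric series approach 1 / (1 - rho).
rho = 0.7
print(sum(rho**k for k in range(50)))  # ~3.333333...
print(1 / (1 - rho))                   # 3.333333...
```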

Now, suppose you instead have the expression $\frac{\epsilon_t}{1 - \rho L}$, where $L$ denotes the lag operator and $|\rho| \lt 1$.

Then, you can treat $\rho L$ as if it were a number acting on $\epsilon_t$.

So, since

$\frac{1}{1 - \rho L} = 1 + \rho L + (\rho L)^2 + (\rho L)^3 + \cdots,$

one can just multiply this by $\epsilon_t$, which results in

$\epsilon_t + \rho \epsilon_{t-1} + \rho^2 \epsilon_{t-2} + \rho^3 \epsilon_{t-3} + \cdots$

So, the expression above is the $MA(\infty)$ representation, which is equivalent to the AR(1).
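You can check this numerically. A sketch (my own construction; $\rho = 0.6$ and the truncation point $K = 40$ are arbitrary): feed the same shocks to the AR(1) recursion and to the truncated $MA(\infty)$ sum, and the two agree to roughly $\rho^K$ precision.

```python
import numpy as np

rng = np.random.default_rng(0)
rho, n, K = 0.6, 500, 40
eps = rng.standard_normal(n)

# AR(1) recursion, started at zero.
y = np.zeros(n)
for t in range(1, n):
    y[t] = rho * y[t - 1] + eps[t]

# Truncated MA(inf) sum at the last observation, using the same shocks.
t = n - 1
ma_approx = sum(rho**i * eps[t - i] for i in range(K + 1))
print(y[t], ma_approx)  # agree to ~rho**K
```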

Another, more intuitive way to think about it is the following:

The AR(1) is: $y_{t} = \rho y_{t-1} + \epsilon_t$.

So, the model is such that the current response is always some fraction $\rho$ of the previous response plus a new error term.

Note that, if one starts from the very beginning of the process with $y_0 = 0$, then really what you are doing is taking an exponentially weighted sum of the error terms, where the weights get smaller for values of $\epsilon_{t-i}$ that are further in the past. So, the $MA(\infty)$ representation is saying: "Forget about the previous response. What one is really doing in the AR(1) is taking the new shocks (the $\epsilon_{t-i}$) and exponentially smoothing them over time with decreasing weights."
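A sketch of exactly this point (the values are my own choices): with $y_0 = 0$, the AR(1) value equals the geometrically weighted sum of all past shocks, not just approximately but to floating-point precision.

```python
import numpy as np

rng = np.random.default_rng(1)
rho, n = 0.8, 20
eps = rng.standard_normal(n)

# AR(1) recursion started from y_0 = 0.
y = 0.0
for t in range(1, n):
    y = rho * y + eps[t]

# Exponentially weighted sum of the same shocks.
weighted = sum(rho**i * eps[(n - 1) - i] for i in range(n - 1))
print(y, weighted)  # identical up to rounding
```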

Note that in the case of the $AR(p)$, the argument is more complex but the concept is the same. If I find the paper, I'll let you know in a comment.
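In the meantime, here is a sketch of the general mechanics (the function name and coefficient layout are my own): for an ARMA(p,q), "expanding the denominator" produces the $MA(\infty)$ weights $\psi_j$ via the recursion $\psi_j = \beta_j + \sum_{i=1}^{\min(j,p)} a_i \psi_{j-i}$, with $\beta_j = 0$ for $j > q$.

```python
import numpy as np

def arma_to_ma(a, beta, n_weights=8):
    """a = [a1, ..., ap] (AR side), beta = [beta0, ..., betaq] (MA side).
    Returns the first n_weights MA(inf) coefficients psi_0, psi_1, ..."""
    p = len(a)
    psi = np.zeros(n_weights)
    for j in range(n_weights):
        psi[j] = beta[j] if j < len(beta) else 0.0
        for i in range(1, min(j, p) + 1):
            psi[j] += a[i - 1] * psi[j - i]
    return psi

# Same ARMA(2,1) as the sympy expansion in the question:
print(arma_to_ma(a=[0.5, 0.3], beta=[1.0, 0.4]))
# -> [1. 0.9 0.75 0.645 0.5475 ...]
```

The weights match the coefficients of the series expansion in the question; if statsmodels is available, I believe `statsmodels.tsa.arima_process.arma2ma([1, -0.5, -0.3], [1.0, 0.4], 8)` should reproduce them too (check the sign convention there).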
