Time Series – State Space Representation of ARMA(p,q) Models from Hamilton

arima, kalman-filter, state-space-models, time-series

I have been reading Hamilton Chapter 13 and he has the following state space representation for an ARMA($p$,$q$). Let $r = \max(p,q+1)$. Then the ARMA($p$,$q$) process is as follows:
$$
\begin{aligned}
y_t -\mu &= \phi_1(y_{t-1} -\mu) + \phi_2(y_{t-2} -\mu) + \dots + \phi_r(y_{t-r} -\mu) \\
&+ \epsilon_t + \theta_1\epsilon_{t-1} + … + \theta_{r-1}\epsilon_{t-r+1}.
\end{aligned}
$$
Then, he defines the State Equation as follows:

$$ \xi_{t+1} = \begin{bmatrix}
\phi_1 & \phi_2 & \dots & \phi_{r-1} & \phi_r \\
1 & 0 & \dots & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \dots &1 &0 \end{bmatrix} \xi_t + \begin{bmatrix}
\epsilon_{t+1} \\
0\\\vdots \\
0
\end{bmatrix} $$

and the observation equation as:

$$y_t = \mu + \begin{bmatrix}
1 & \theta_1 & \theta_2 & \dots & \theta_{r-1} \\
\end{bmatrix} \xi_t. $$

I do not understand what $\xi_t$ is in this case, because in his AR($p$) representation it is $\begin{bmatrix}
y_{t} - \mu \\
y_{t-1} - \mu\\ \vdots \\
y_{t-p+1} - \mu
\end{bmatrix} $ and in his MA(1) representation it is $\begin{bmatrix}
\epsilon_{t} \\
\epsilon_{t-1}
\end{bmatrix} $.

Could someone explain this to me a little bit better?

Best Answer

Hamilton shows that this is a correct representation in the book, but the approach may seem a bit counterintuitive. Let me therefore first give a high-level answer that motivates his modeling choice and then elaborate a bit on his derivation.

Motivation:

As should become clear from reading Chapter 13, there are many ways to write a dynamic model in state-space form. We should therefore ask why Hamilton chose this particular representation. The reason is that this representation keeps the dimensionality of the state vector low. Intuitively, you would think (or at least I would) that the state vector for an ARMA($p$,$q$) needs to be at least of dimension $p+q$. After all, just from observing, say, $y_{t-1}$, we cannot infer the value of $\epsilon_{t-1}$. Yet he shows that we can define the state-space representation in a clever way that leaves the state vector of dimension at most $r = \max\{p, q + 1\}$. Keeping the state dimensionality low may be important for the computational implementation, I guess. It turns out that his state-space representation also offers a nice interpretation of an ARMA process: the unobserved state is an AR($p$), while the MA($q$) part arises due to measurement error.
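To make the dimension claim concrete, here is a minimal sketch in Python with NumPy that builds the $r \times r$ transition matrix and the $1 \times r$ observation row from the ARMA coefficients. The function name and variable names are my own, not Hamilton's; coefficients beyond $p$ or $q$ are zero-padded, exactly as in his convention.

```python
import numpy as np

def arma_state_space(phi, theta):
    """Build the companion-form matrices of Hamilton's ARMA(p,q) representation.

    phi   : list of AR coefficients [phi_1, ..., phi_p]
    theta : list of MA coefficients [theta_1, ..., theta_q]
    Returns (F, H) with state dimension r = max(p, q + 1);
    coefficients with index beyond p or q are set to zero.
    """
    p, q = len(phi), len(theta)
    r = max(p, q + 1)
    # Transition matrix: first row holds phi_1..phi_r, the subdiagonal
    # of ones shifts the remaining state entries down one period.
    F = np.zeros((r, r))
    F[0, :p] = phi
    F[1:, :-1] = np.eye(r - 1)
    # Observation row: [1, theta_1, ..., theta_{r-1}] (zeros past q).
    H = np.zeros(r)
    H[0] = 1.0
    H[1:q + 1] = theta
    return F, H

# ARMA(2,1): r = max(2, 1 + 1) = 2, so the state is only 2-dimensional.
F, H = arma_state_space([0.5, -0.2], [0.3])
```

Note that the state stays 2-dimensional here even though $p + q = 3$, which is the point of the representation.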

Derivation:

Now for the derivation. First note that, using lag operator notation, the ARMA($p$,$q$) is defined as $$ (1-\phi_1L - \ldots - \phi_rL^r)(y_t - \mu) = (1 + \theta_1L + \ldots + \theta_{r-1}L^{r-1})\epsilon_t, $$ where we let $\phi_j = 0$ for $j>p$ and $\theta_j = 0$ for $j>q$, and we omit $\theta_r$ since $r$ is at least $q+1$. So all we need to show is that his state and observation equations imply the equation above.

Let the state vector be $$ \mathbf{\xi}_t = (\xi_{1,t}, \xi_{2,t},\ldots,\xi_{r,t})^\top. $$ Now look at the state equation. You can check that equations $2$ to $r$ simply shift each entry down one position, moving $\xi_{i,t}$ to $\xi_{i+1,t+1}$, and discard $\xi_{r,t}$ from the state vector at $t+1$. The first equation, defining $\xi_{1,t+1}$, is therefore the relevant one. Writing it out: $$ \xi_{1,t+1} = \phi_1 \xi_{1,t} + \phi_2 \xi_{2,t} + \ldots + \phi_r \xi_{r,t} + \epsilon_{t+1}. $$ Since the second element of $\mathbf{\xi}_{t}$ is the first element of $\mathbf{\xi}_{t-1}$, the third element of $\mathbf{\xi}_{t}$ is the first element of $\mathbf{\xi}_{t-2}$, and so on, we can rewrite this using lag operator notation and move the lag polynomial to the left-hand side (equation 13.1.24 in H.): $$ (1-\phi_1L - \ldots - \phi_rL^r)\xi_{1,t+1} = \epsilon_{t+1}. $$ So the hidden state follows an autoregressive process.

Next, the observation equation is $$ y_t = \mu + \xi_{1,t} + \theta_1\xi_{2,t} + \ldots + \theta_{r-1}\xi_{r,t} $$ or, again using the fact that $\xi_{i+1,t}$ equals $\xi_{1,t-i}$, $$ y_t - \mu = (1 + \theta_1L + \ldots + \theta_{r-1}L^{r-1})\xi_{1,t}. $$ This does not look much like an ARMA so far, but now comes the nice part: multiply the last equation by $(1-\phi_1L - \ldots - \phi_rL^r)$: $$ (1-\phi_1L - \ldots - \phi_rL^r)(y_t - \mu) = (1 + \theta_1L + \ldots + \theta_{r-1}L^{r-1})(1-\phi_1L - \ldots - \phi_rL^r)\xi_{1,t}. $$ But from the state equation (lagged by one period), we have $(1-\phi_1L - \ldots - \phi_rL^r)\xi_{1,t} = \epsilon_{t}$!
So the above is equivalent to $$ (1-\phi_1L - \ldots - \phi_rL^r)(y_t - \mu) = (1 + \theta_1L + \ldots + \theta_{r-1}L^{r-1})\epsilon_{t} $$ which is exactly what we needed to show! So the state-observation system correctly represents the ARMA(p,q). I was really just paraphrasing Hamilton, but I hope that this is useful anyway.
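As a sanity check of the equivalence just derived, one can simulate the state-space recursion and the direct ARMA recursion from the same sequence of shocks and confirm that they produce identical sample paths. This is a minimal sketch assuming NumPy; all variable names are mine, and both recursions are started from zero pre-sample values so that the truncated lag polynomials match exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
phi, theta, mu, T = [0.5, -0.2], [0.3], 1.0, 200   # example ARMA(2,1)
p, q = len(phi), len(theta)
r = max(p, q + 1)

# Hamilton's matrices: F is the companion matrix, H the observation row.
F = np.zeros((r, r)); F[0, :p] = phi; F[1:, :-1] = np.eye(r - 1)
H = np.zeros(r); H[0] = 1.0; H[1:q + 1] = theta
eps = rng.standard_normal(T)

# State-space path: xi_{t+1} = F xi_t + (eps_{t+1}, 0, ..., 0)^T,
# y_t = mu + H xi_t, starting from xi = 0.
xi = np.zeros(r)
y_ss = np.empty(T)
for t in range(T):
    xi = F @ xi
    xi[0] += eps[t]
    y_ss[t] = mu + H @ xi

# Direct ARMA(p,q) recursion with zero pre-sample shocks and y = mu.
y_direct = np.empty(T)
for t in range(T):
    ar = sum(phi[i] * (y_direct[t - 1 - i] - mu) for i in range(p) if t - 1 - i >= 0)
    ma = sum(theta[j] * eps[t - 1 - j] for j in range(q) if t - 1 - j >= 0)
    y_direct[t] = mu + ar + eps[t] + ma

assert np.allclose(y_ss, y_direct)  # the two paths coincide
```

The assertion passing for arbitrary coefficient choices is just the lag-polynomial cancellation above played out numerically.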
