Time Series – How to Generate Correlated ARMA(1,1) Models

armacross correlationtime series

I am interested in generating two ARMA(1,1) time series with a pre-defined cross-correlation p between x(t) and y(t).

If you could please provide the mathematical framework and the code in R for generating it would be perfect!

I have looked at the question Generate two correlated ARMA(1,1) processes which is not a duplicate, since that question (despite the title) looks for a given correlation between the errors, while I am looking for a given correlation between the series themselves.

Best Answer

I know that you want the correlation between the series themselves but a valid way (and to me the most natural way) to achieve it is by choosing the correlation between the disturbance terms. As I explain below, you can follow the idea given here, here, and here. All you need is the relationship between the parameters of the ARMA(1,1) models and the desired correlation between the series.

Let's define the data generating process as follows:

\begin{equation} x_t = \phi x_{t-1} + \epsilon_t + \theta \epsilon_{t-1} \,, \quad \epsilon_t \sim \hbox{NID}(0, \sigma^2_\epsilon) \\ y_t = \alpha y_{t-1} + \eta_t + \beta \eta_{t-1} \,, \quad \eta \sim \hbox{NID}(0, \sigma^2_\eta) \\ \hbox{Cov}(\epsilon_t, \eta_s) = \sigma \, \hbox{ if } t=s \hbox{ and zero otherwise.} \end{equation}

It can be checked that the relationship between the parameters of the ARMA(1,1) processes $x_t$ and $y_t$ and their correlation is given by:

\begin{equation} \rho = \frac{\gamma}{\sigma_x\sigma_y} \,, \hbox{ where } \gamma = \frac{\sigma (\phi\beta + 1 + \alpha\theta + \beta\theta)}{1 - \phi\alpha} \, \hbox{ and } \\ \sigma_x = \sqrt{\sigma^2_\epsilon\frac{1 + \theta^2 + 2\phi\theta}{1 - \phi^2}} \,, \sigma_y = \sqrt{\sigma^2_\eta\frac{1 + \beta^2 + 2\alpha\beta}{1 - \alpha^2}} \,. \end{equation}

Given the AR and MA coefficients and the variances of the disturbance terms, you can easily get the value of $\hbox{Cov}(\epsilon_t,\eta_t)=\sigma$ that involves a desired correlation between the series, $\rho^*$, as follows:

\begin{equation} \sigma = \rho^* \sigma_x \sigma_y (1 - \phi\alpha) / (\phi\beta + 1 + \alpha\theta + \beta\theta) \,. \end{equation}

Below I show an example to test this relationship between the parameters. Given the parameters defined below and a desired correlation between the series equal to 0.2, several series are generated from an ARMA(1,1) with correlated disturbances. The correlation between each pair of series is stored in object rho.

require("mvtnorm")
# parameters of the data generating process
desired.rho <- 0.2
Sigma <- diag(c(2,3))    
phi <- 0.7
theta <- 0.3
alpha <- -0.4
beta <- 0.2
# implied value of the covariance between the disturbance terms
sdx <- sqrt(Sigma[1,1] * (1 + theta^2 + 2*phi*theta) / (1 - phi^2))
sdy <- sqrt(Sigma[2,2] * (1 + beta^2 + 2*alpha*beta) / (1 - alpha^2))
scov <- desired.rho * sdx * sdy * (1 - phi*alpha) / (phi*beta + 1 + alpha*theta + beta*theta)
scov
# [1] 1.022579
Sigma[1,2] <- Sigma[2,1] <- scov
n <- 200
niter <- 10000
rhos <- rep(NA, niter)
set.seed(123)
for (i in seq_len(niter))
{
  eps <- mvtnorm::rmvnorm(n = n, mean = rep(0, 2), sigma = Sigma)
  x1 <- arima.sim(n = n, model = list(ar = phi, ma = theta), innov = eps[,1])
  x2 <- arima.sim(n = n, model = list(ar = alpha, ma = beta), innov = eps[,2])  
  rhos[i] <- cor(x1, x2)
}
summary(rhos)
#     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
# -0.02344  0.16380  0.19970  0.19950  0.23590  0.39460

We can see that the correlation between the series (not between the errors) is on average close to the value 0.2, as desired in this exercise. This correlation implies a covariance between the errors equal to 1.022579.

Edit

This is the derivation of the elements in the expression above. This make use of the following facts: the process is stationary; $x_t$ depends on past values of $\epsilon_t$ but not on future values of $\epsilon_t$; $\epsilon_t$ is independently distributed (same for $\eta_t$) and the covariance between concurrent values of $\epsilon_t$ and $\eta_t$ is $\sigma$.

\begin{eqnarray} \gamma &=& \hbox{E} \left[ (\phi x_{t-1} + \epsilon_t + \theta \epsilon_{t-1})(\alpha y_{t-1} + \eta_t + \beta \eta_{t-1}) \right] = \\ \gamma &=& \hbox{E}\left[ \phi\alpha x_{t-1}y_{t-1} \right] + \hbox{E}\left[ \phi x_{t-1}\eta_t \right] + \hbox{E}\left[ \phi\beta\eta_{t-1} \right] + \hbox{E}\left[ \alpha\epsilon_t y_{t-1} \right] + \hbox{E}\left[ \epsilon_t\eta_t \right] + \\ &+& \hbox{E}\left[ \beta\epsilon_t\eta_{t-1} \right] + \hbox{E}\left[ \alpha\theta \epsilon_{t-1}y_{t-1} \right] + \hbox{E}\left[ \theta\epsilon_{t-1}\eta_t\right] + \hbox{E}\left[ \beta\theta\epsilon_{t-1}\eta_{t-1} \right] \\ \gamma &=& \phi\alpha\gamma + 0 + \phi\beta\sigma + 0 + \sigma + 0 + \alpha\theta\sigma + 0 + \beta\theta\sigma \\ (1 - \phi\alpha)\gamma &=& \sigma (\phi\beta + 1 + \alpha\theta + \beta\theta) \\ \gamma &=& \frac{\sigma (\phi\beta + 1 + \alpha\theta + \beta\theta)}{(1 - \phi\alpha)} \end{eqnarray}

\begin{eqnarray} \hbox{Var}(x_t) = \sigma^2_x &=& \hbox{E}\left[ (\phi x_{t-1} + \epsilon_t + \theta \epsilon_{t-1})^2 \right] \\ \sigma^2_x &=& \phi^2 \hbox{E}\left[ x_{t-1}^2 \right] + 2\phi\hbox{E}\left[ x_{t-1}\epsilon_t \right] + 2\phi\theta\hbox{E}\left[ x_{t-1}\epsilon_{t-1} \right] + \hbox{E}\left[ \epsilon_t^2 \right] + \\ &+& 2\theta \hbox{E}\left[ \epsilon_t\epsilon_{t-1} \right] + \theta^2 \hbox{E}\left[ \epsilon_{t-1}^2 \right] \\ \sigma^2_x &=& \phi^2 \sigma^2_x + 0 + 2\phi\theta\sigma^2_\epsilon + \sigma^2_\epsilon + 0 + \theta^2\sigma^2_\epsilon \\ (1 - \phi^2) \sigma^2_x &=& \sigma^2_\epsilon (1 + \theta^2 + 2\phi\theta) \\ \sigma^2_x &=& \frac{\sigma^2_\epsilon (1 + \theta^2 + 2\phi\theta)}{(1 - \phi^2)} \end{eqnarray}

Related Solutions

Time Series – How to Generate Two Correlated ARMA(1,1) Processes

The following R code will do what you want. I have used $\rho_{12}=0.5$, $\mu=10$, $\phi_1=0.2$, $\phi_2=0.8$, $\theta_1=0.3$, $\theta_2=-0.7$ and I have assumed that the errors have mean zero and variance 1.

library(mvtnorm)
rho <- 0.5
mu <- c(10,10)
phi <- c(0.2,0.8)
theta <- c(0.3,-0.7)
d <- ts(matrix(0,ncol=2,nrow=50))
e <- ts(rmvnorm(50,sigma=cbind(c(1,rho),c(rho,1))))
for(i in 2:50)
  d[i,] <- mu + phi*d[i-1,] - theta*(e[i-1,]+e[i,])
plot(d)

Solved – Fitted values of ARMA model

To answer your questions, you basically need to know how the residuals i.e. $e_t$ are calculated in an arma model. Because then $\hat{X_{t}}=X_{t}-e_{t}$. Let's first generate a fake data ($X_t$) from arima(.5,.6) and fit the arma model (without mean):

library(forecast)
n=1000
ts_AR <- arima.sim(n = n, list(ar = 0.5,ma=0.6))
f=arima(ts_AR,order=c(1,0,1),include.mean=FALSE)
summary(f)
    Series: ts_AR 
    ARIMA(1,0,1) with zero mean     

    Coefficients:
             ar1     ma1
          0.4879  0.5595
    s.e.  0.0335  0.0317

    sigma^2 estimated as 1.014:  log likelihood=-1426.7
    AIC=2859.4   AICc=2859.42   BIC=2874.12

    Training set error measures:
                         ME    RMSE       MAE      MPE     MAPE      MASE
    Training set 0.02102758 1.00722 0.8057205 40.05802 160.1078 0.6313145

Now I create the residuals as follows: $e_1=0$ (since there is no residual at 1) and for $t=2,...,n$ we have: $e_t=X_t-Ar*X_{t-1}-Ma*e_{t-1}$, where $Ar$ and $Ma$ are the estimated auto-regressive and moving average part in above fitted model. Here is the code:

e = rep(1,n)
e[1] = 0 ##since there is no residual at 1, e1 = 0
for (t in (2 : n)){
  e[t] = ts_AR[t]-coef(f)[1]*ts_AR[t-1]-coef(f)[2]*e[t-1]
}

Once you find the residuals $e_{t}$, the fitted values are just $\hat{X_{t}}=X_{t}-e_{t}$. So in the following, I compared the first 10 fitted values obtained from R and the ones I can calculate from $e_{t}$ I created above (i.e. manually).

cbind(fitted.from.package=fitted(f)[1:10],fitted.calculated.manually=ts_AR[1:10]-e[1:10])
      fitted.from.package fitted.calculated.manually
 [1,]          -0.4193068                 -1.1653515
 [2,]          -0.8395447                 -0.5685977
 [3,]          -0.4386956                 -0.6051324
 [4,]           0.3594109                  0.4403898
 [5,]           2.9358336                  2.9013738
 [6,]           1.3489537                  1.3682191
 [7,]           0.5329436                  0.5219576
 [8,]           1.0221220                  1.0283511
 [9,]           0.6083310                  0.6048668
[10,]          -0.5371484                 -0.5352324

As you see there are close but not exactly the same. The reason is that when I created the residuals I set $e_{1}=0$. There are other choices though. For example based on the help file to arima, the residuals and their variance found by a Kalman filter and therefore their calculation of $e_t$ will be slightly different from me. But as time goes on they are converging.
Now for the Ar(1) model. I fitted the model (without mean) and directly show you how to calculate the fitted values using the coefficients. This time I didn't calculate the residuals. Note that I reported the first 10 fitted values removing the first one (as again it would be different depending on how you define it). As you can see, they are completely the same.

f=arima(ts_AR,order=c(1,0,0),include.mean=FALSE)
cbind(fitted.from.package=fitted(f)[2:10],fitted.calculated.manually=coef(f)*ts_AR[1:9])
      fitted.from.package fitted.calculated.manually
 [1,]          -0.8356307                 -0.8356307
 [2,]          -0.6320580                 -0.6320580
 [3,]           0.0696877                  0.0696877
 [4,]           2.1549019                  2.1549019
 [5,]           2.0480074                  2.0480074
 [6,]           0.8814094                  0.8814094
 [7,]           0.9039184                  0.9039184
 [8,]           0.8079823                  0.8079823
 [9,]          -0.1347165                 -0.1347165

Best Answer

Related Solutions

Time Series – How to Generate Two Correlated ARMA(1,1) Processes

Solved – Fitted values of ARMA model

Related Question