VAR or VECM for a mix of stationary and nonstationary variables

cointegration, stationarity, time series, vector-autoregression, vector-error-correction-model

I have 4 time series. One of them is stationary and the rest are not. I need to find the relationship between them. I will use AIC to choose the lag length.

  1. Should I use a VAR or a VECM to model the relationship between them?
  2. Will a VAR or a VECM give me the relationship as an equation that can be used for forecasting?
  3. Do I need to perform Johansen's cointegration test?
  4. If so, what would it accomplish?

Best Answer

So you have three nonstationary series and one stationary series. Let us call them $x_1$, $x_2$, $x_3$, and $x_4$, respectively. Suppose the nonstationarity of $x_1$, $x_2$, $x_3$ is of a unit-root kind (rather than of some other kind); that is, each of $x_1$, $x_2$, $x_3$ is integrated of order one, I(1). You can determine the order of integration using, for example, the augmented Dickey-Fuller test (ADF test).

Test each pair of the nonstationary series ($x_1$ and $x_2$; $x_1$ and $x_3$; $x_2$ and $x_3$) for cointegration using the Johansen or the Engle-Granger test.
Then test all three series ($x_1$, $x_2$, $x_3$) for cointegration using the Johansen test.
Depending on the results of the tests, you may find yourself in one of the following situations:

(A) No cointegration
(B) Two of the variables (say, $x_1$ and $x_2$) are cointegrated while the third variable (say, $x_3$) is not cointegrated with them
(C) The three variables ($x_1$, $x_2$, $x_3$) are cointegrated

In general, you want the following:

  • Models for cointegrated variables should have an error-correction representation; otherwise the model would be misspecified (cointegration goes hand-in-hand with the error correction representation).
  • Models for stationary dependent variables should not have nonstationary explanatory variables (except perhaps for stationary combinations of cointegrated nonstationary variables); otherwise the linear combination of the regressors would diverge from the regressand.
  • Models for nonstationary dependent variables should have at least one nonstationary explanatory variable; otherwise the regressand would diverge from the linear combination of the regressors. Keep in mind that estimators of coefficients on integrated variables have nonstandard distributions.

Based on these principles, you may do the following:

If (A) then first-difference each of the three variables ($x_1$, $x_2$, $x_3$), and use them together with the stationary variable $x_4$ to build a VAR model.

If (B) then build a model where

  • $\Delta x_1$ depends on the error correction term and lags of $\Delta x_1$, $\Delta x_2$, $\Delta x_3$, $x_4$;
  • $\Delta x_2$ depends on the error correction term and lags of $\Delta x_1$, $\Delta x_2$, $\Delta x_3$, $x_4$;
  • $\Delta x_3$ depends on lags of $\Delta x_1$, $\Delta x_2$, $\Delta x_3$, $x_4$;
  • $x_4$ depends on the error correction term and lags of $\Delta x_1$, $\Delta x_2$, $\Delta x_3$, $x_4$.
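Written out, the model for (B) could look as follows. The notation is an assumption introduced here for illustration: $z_t = (\Delta x_{1,t}, \Delta x_{2,t}, \Delta x_{3,t}, x_{4,t})^\top$ collects the stationary regressors, $\beta$ and $\mu$ define the cointegrating relation between $x_1$ and $x_2$, $\alpha_j$ are adjustment coefficients, $c_j$ are intercepts, $\gamma_{j,i}$ are coefficient vectors, and $p$ is the lag length (chosen, for example, by AIC).

$$
\begin{aligned}
\text{ect}_{t-1} &= x_{1,t-1} - \beta x_{2,t-1} - \mu, \\
\Delta x_{1,t} &= c_1 + \alpha_1\,\text{ect}_{t-1} + \sum_{i=1}^{p} \gamma_{1,i}^\top z_{t-i} + \varepsilon_{1,t}, \\
\Delta x_{2,t} &= c_2 + \alpha_2\,\text{ect}_{t-1} + \sum_{i=1}^{p} \gamma_{2,i}^\top z_{t-i} + \varepsilon_{2,t}, \\
\Delta x_{3,t} &= c_3 + \sum_{i=1}^{p} \gamma_{3,i}^\top z_{t-i} + \varepsilon_{3,t}, \\
x_{4,t} &= c_4 + \alpha_4\,\text{ect}_{t-1} + \sum_{i=1}^{p} \gamma_{4,i}^\top z_{t-i} + \varepsilon_{4,t}.
\end{aligned}
$$

Every regressor on the right-hand side is stationary: the differences and $x_4$ by assumption, and the error correction term because $x_1$ and $x_2$ are cointegrated.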

If (C) then build a model where

  • $\Delta x_1$ depends on the error correction term and lags of $\Delta x_1$, $\Delta x_2$, $\Delta x_3$, $x_4$;
  • $\Delta x_2$ depends on the error correction term and lags of $\Delta x_1$, $\Delta x_2$, $\Delta x_3$, $x_4$;
  • $\Delta x_3$ depends on the error correction term and lags of $\Delta x_1$, $\Delta x_2$, $\Delta x_3$, $x_4$;
  • $x_4$ depends on the error correction term and lags of $\Delta x_1$, $\Delta x_2$, $\Delta x_3$, $x_4$.
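One way to estimate a model like (C) equation by equation is a two-step procedure in the spirit of Engle-Granger: first estimate the cointegrating relation by OLS on the levels, then regress each dependent variable on the lagged error correction term and lagged stationary regressors. The sketch below assumes simulated data, a single cointegrating relation normalized on $x_1$, and one lag. It is a substitute for, not an implementation of, Johansen-type maximum likelihood estimation.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400

# Simulated data: x1, x2, x3 are I(1) and jointly cointegrated; x4 is I(0).
x2 = np.cumsum(rng.normal(size=n))
x3 = np.cumsum(rng.normal(size=n))
x1 = 1.0 + x2 + x3 + rng.normal(size=n)
x4 = np.zeros(n)
shocks = rng.normal(size=n)
for t in range(1, n):
    x4[t] = 0.5 * x4[t - 1] + shocks[t]

# Step 1: cointegrating regression of x1 on a constant, x2 and x3.
levels = np.column_stack([np.ones(n), x2, x3])
beta, *_ = np.linalg.lstsq(levels, x1, rcond=None)
ect = x1 - levels @ beta  # estimated error correction term

# Step 2: regress (dx1, dx2, dx3, x4) at time t on a constant,
# ect at time t-1, and (dx1, dx2, dx3, x4) at time t-1.
z = np.column_stack([np.diff(x1), np.diff(x2), np.diff(x3), x4[1:]])
Y = z[1:]
X = np.column_stack([np.ones(len(Y)), ect[1:-1], z[:-1]])
coef, *_ = np.linalg.lstsq(X, Y, rcond=None)  # one column per equation
alpha = coef[1]  # adjustment coefficients on the error correction term
print("adjustment coefficients:", np.round(alpha, 2))

# One-step-ahead forecast from the last observed values.
x_next = np.concatenate([[1.0], [ect[-1]], z[-1]])
z_forecast = x_next @ coef
x1_forecast = x1[-1] + z_forecast[0]  # level forecast for x1
print("forecast of x1:", round(float(x1_forecast), 3))
```

Dropping the error correction term from the $\Delta x_3$ equation, and estimating the cointegrating relation from $x_1$ and $x_2$ only, would give the corresponding sketch for (B).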

These are pretty general models with lots of regressors. You may find it beneficial to exclude some variables from some equations or use penalization to avoid overfitting.