Solved – Testing for cointegration and building a VEC model

cointegration, r, time series, vector-autoregression, vector-error-correction-model

I have three variables, all of which become stationary only after second differencing. I want to check for cointegration using the code below. If I run the cointegration analysis pairwise, I get these results:

VARselect(f1[2:3], lag.max=10)$selection ## optimal no of lags to be 7
coint=ca.jo(f1[2:3], ecdet="none", type="trace", K=7, spec="longrun")
summary(coint) ## indicates cointegrating relationship
Values of teststatistic and critical values of test:
          test 10pct  5pct  1pct
r <= 1 | 29.23  6.50  8.18 11.65
r = 0  | 75.18 15.66 17.95 23.52

This means that there is no cointegrating relationship between them. If I do the same for the other pairs, f1[3:4] and f1[c(2,4)], I get one cointegrating relationship.

VARselect is used to choose the optimal lag. For all the variables together:

VARselect(f1[2:4], lag.max=10)$selection
AIC(n)  HQ(n)  SC(n) FPE(n) 
     5      5      5      4 

coint=ca.jo(f1[2:4], ecdet="none", type="trace", K=5, spec="longrun")
summary(coint)
Values of test statistic and critical values of test:
          test 10pct  5pct  1pct
r <= 2 |  0.08  6.50  8.18 11.65
r <= 1 | 14.24 15.66 17.95 23.52
r = 0  | 39.67 28.71 31.52 37.22

1. Do I need to include all the variables when running a VECM?
2. Is VARselect the right way to choose the lag to be specified in ca.jo?
3. This would mean that there is cointegration between the variables and that I need to run a VECM. But how do I know how many cointegrating relationships there are? As far as I have seen, $r=2$ is specified when fitting a VECM. Is $r=2$ the correct way to specify a VECM?

cajools(coint)
cajorls(coint, r = 2) # or use this

4. Is the procedure I am following a correct way to model?

Update 1:

1. I think it is up to us to decide what kind of relationship we would like to examine and then set up the model accordingly.
2. Yes, it is an iterative VAR procedure for choosing the right lag length.
3. Not clear: so the highest rank I cannot reject would be 2 in the three-variable case?

Update 2: Regarding 3, I was asking about f1[2:4], for which I produced the statistics above. As I read them, there is only one cointegrating relationship, so $r=1$ when fitting the VECM.

Update 3:

Since my variables become stationary only after second differencing, can I apply the Johansen cointegration test, which is designed for I(1) series? Or do I have to feed in the first differences of my variables in order to perform the Johansen test?
Also, since VARselect gave an optimal lag of 4, I would have to take lag=3 when running the cointegration model.

Best Answer

You seem to be doing pairwise analysis when you in fact have three variables. This way you may miss cointegrating relationships that are not pairwise but involve more variables. The standard approach to modelling cointegrated variables is to use all the variables you have, provided they are integrated of the same order.

Now to answer your questions,

1. Yes, include all three variables in VAR modelling and cointegration testing.
2. Yes, it is an acceptable method. You can find it used, e.g., in Pfaff (2008), p. 149, or in the vignette of the "vars" package in R, p. 17.
3. The Johansen procedure, as implemented in the function ca.jo, will help you find the number of cointegrating vectors. Take the output of ca.jo, start with $r=0$ and see whether you can reject the null hypothesis of $r=0$ using the test statistic and the critical values reported in the output. If you reject, move to $r=1$ and upwards until you cannot reject. The first rank that you cannot reject is the number of cointegrating vectors. If you can reject all of them, all of your series appear to be stationary.
4. In general, any modern time series textbook should include a section on cointegration testing using the Johansen procedure; just follow it.
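
The sequential rule in point 3 can be sketched in a few lines of base R. This is only an illustration: `select_rank` is a hypothetical helper, not a function from urca, and it assumes you have copied the trace statistics and the critical values at one significance level out of `summary(coint)`, ordered from $r=0$ upwards.

```r
# Hypothetical helper (not part of urca): sequential rank selection
# from Johansen trace statistics.
# stat, crit: test statistics and critical values, ordered r = 0, r <= 1, ...
select_rank <- function(stat, crit) {
  stopifnot(length(stat) == length(crit))
  not_rejected <- which(stat <= crit)
  if (length(not_rejected) == 0) {
    # Every null is rejected: full rank, the series appear stationary
    return(length(stat))
  }
  # The first rank whose null hypothesis cannot be rejected
  not_rejected[1] - 1
}

# Three-variable system f1[2:4], 5% critical values from the question
r_three <- select_rank(stat = c(39.67, 14.24, 0.08),
                       crit = c(31.52, 17.95, 8.18))

# Pairwise system f1[2:3], 5% critical values from the question
r_pair <- select_rank(stat = c(75.18, 29.23),
                      crit = c(17.95, 8.18))
```

With the numbers reported in the question, `r_three` is 1 (one cointegrating vector among the three series) and `r_pair` is 2, which equals the number of series in that pair and therefore falls under the "all rejected" case described above.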

Update (for the updated OP)

1. Neglecting cointegration relationships beyond pairwise ones may lead to omitted variable bias, because you would be omitting the error correction terms associated with the neglected cointegrating vectors.
2. Cannot understand the question, sorry.
3. If the variables are truly integrated, you cannot have $r=m$, where $m$ is the number of time series in the system. In a three-variable case, $r=2$ is the highest rank; $r=3$ already implies the variables are not integrated to begin with.
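
Putting these points together, the whole workflow might look roughly like the sketch below. It runs on simulated data, since f1 is not available here; the data-generating process, the seed, the choice of AIC and the 5% level are all assumptions made for illustration, not the asker's setup.

```r
# Sketch only: simulated series stand in for the asker's f1[2:4].
library(vars)   # VARselect()
library(urca)   # ca.jo(), cajorls()

set.seed(1)
n <- 300
trend <- cumsum(rnorm(n))            # one shared stochastic trend (assumption)
y <- data.frame(
  a = trend + rnorm(n),
  b = 0.5 * trend + rnorm(n),
  c = -trend + rnorm(n)
)

# Lag order of the VAR in levels, chosen here by AIC
sel <- VARselect(y, lag.max = 10, type = "const")$selection
K <- max(2, unname(sel["AIC(n)"]))   # ca.jo requires K >= 2

# Trace test on all three series jointly; K is the lag order in levels
coint <- ca.jo(y, ecdet = "none", type = "trace", K = K, spec = "longrun")

# Statistics are stored from r <= m-1 down to r = 0, so reverse them
stat <- rev(coint@teststat)
crit <- rev(coint@cval[, "5pct"])
rejected <- stat > crit
r <- if (all(rejected)) length(stat) else unname(which(!rejected)[1]) - 1

# A restricted VECM only makes sense for 0 < r < m
if (r > 0 && r < ncol(y)) {
  vecm <- cajorls(coint, r = r)
  print(vecm$beta)                   # normalised cointegrating vectors
}
```

The guard before `cajorls` reflects point 3: with $m=3$ series, only $r=1$ or $r=2$ leads to an error correction model.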

References

Pfaff, Bernhard. Analysis of integrated and cointegrated time series with R. Springer Science & Business Media, 2008.
Pfaff, Bernhard. "VAR, SVAR and SVEC models: Implementation within R package vars." Journal of Statistical Software 27.4 (2008): 1-32.

Related Solutions

Solved – Why does ca.jo have a minimum lag order of 2

I updated the package tsDyn (version 0.9-40 submitted to CRAN), so that its VECM() function can handle your case of lag=1. Note that:

With the function VECM(), use lag=0 for the case you described as lag=1.
A warning will be printed, since fevd(), irf(), predict() and related functions are not guaranteed to work.
If you want no intercept, use include="none".

Example:

library(tsDyn)
data(barry)
summary(VECM(barry, lag=0, estim="ML"))

#############
###Model VECM 
#############
Full sample size: 324   End sample size: 323
Number of variables: 3  Number of estimated slope parameters 6
AIC -4871.5     BIC -4848.83    SSR 29.3275
Cointegrating vector (estimated by ML):
   dolcan    cpiUSA    cpiCAN
r1      1 -0.021234 0.0402079


            ECT                 Intercept         
Equation dolcan -0.0004(0.0011)     0.0024(0.0030)    
Equation cpiUSA -0.0436(0.0155)**   0.3685(0.0413)*** 
Equation cpiCAN -0.0824(0.0214)***  0.4649(0.0572)***

Solved – VECM, positive loading coefficients of EC terms

Let the two cointegrating variables be $x$ and $y$.

The error correction term is $x-by$.

Consider equation 1 in the system (equation 1 or 2 does not matter, any one of them is enough to understand).

The error correction term with its loading is $a(x-by)$.

Now consider what happens when you switch the positions of $x$ and $y$ by renormalizing the cointegration vector. You use the following: $a(x-by)=-ba(y-\frac{1}{b}x)$ and get the new error correction term with its loading $a'(y-b'x)$ where $a'=-ba$ and $b'=\frac{1}{b}$.

That means that, for $b>0$, you should actually expect the opposite sign after renormalization, so it is not a mistake. Your reported coefficient values seem to agree with this.
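
As a purely illustrative check with made-up numbers (not the asker's estimates), take $a=-0.2$ and $b=2$:

$$a(x-by) = -0.2\,(x-2y) = 0.4\,(y-0.5x), \qquad a'=-ba=0.4, \quad b'=\frac{1}{b}=0.5.$$

The same error correction term therefore carries a negative loading under one normalization and a positive loading under the other.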
