Solved – Interpretation of VAR and causality

Tags: cointegration, granger-causality, stationarity, time series, vector-autoregression

I have two time series (X1 and X2), each with 900 records. I wanted to establish a relationship between them and express it as an equation. I did the following:

1) Checked the correlation; it came out as 0.80. Since I wanted to build a robust model, I read further and learned that correlation is not the right way to find a relation between time series, because of the risk of spurious regression.

2) Then I tested both time series for stationarity and got the p-values below.

Time series       X1      X2   
ADF test        0.28    0.07    
KPSS test       0.01    0.01                               

That means I can safely conclude that there is no unit root and both series are stationary.
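For reference, the two tests can be run as below. This is a minimal sketch, assuming `X1` and `X2` are plain numeric vectors; `adf.test` and `kpss.test` come from the `tseries` package. Note that the two tests have opposite null hypotheses, which matters for the interpretation.

```r
# Sketch of the stationarity checks above (tseries package assumed).
library(tseries)

adf.test(X1)   # H0: unit root (non-stationary)
kpss.test(X1)  # H0: (level) stationarity -- the hypotheses are reversed
adf.test(X2)
kpss.test(X2)
```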

3) Then I checked the lag length using VARselect and got the results below:

VARselect(mydata, lag.max=8, type="const")                      
$selection                      
AIC(n)  HQ(n)  SC(n) FPE(n)                         
     4      2      1      4                         

$criteria                       
                  1            2            3            4            5                     
AIC(n)     9.536526     9.518359     9.517730     9.514627     9.515910                     
HQ(n)      9.547980     9.537448     9.544454     9.548987     9.557905                     
SC(n)      9.566621     9.568517     9.587951     9.604911     9.626258                     
FPE(n) 13856.729981 13607.265067 13598.708504 13556.586100 13574.003525                     
                  6            7            8                       
AIC(n)     9.518960     9.522675     9.528970                       
HQ(n)      9.568591     9.579942     9.593872                       
SC(n)      9.649371     9.673149     9.699507                       
FPE(n) 13615.481638 13666.186272 13752.515062

I guess that means I should choose 1 as the lag length since AIC(n) is highest for 1. Please correct me if I am wrong. (The data I have is daily over the last 3 years.)

4) After performing Johansen's test, I got:

###################### 
# Johansen-Procedure # 
###################### 

Test type: maximal eigenvalue statistic (lambda max) , without linear trend and constant in cointegration 

Eigenvalues (lambda):
[1]  4.868739e-02  8.650614e-03 -2.834784e-19

Values of teststatistic and critical values of test:

          test 10pct  5pct  1pct
r <= 1 |  8.51  7.52  9.24 12.97
r = 0  | 48.86 13.75 15.67 20.20

Eigenvectors, normalised to first column:
(These are the cointegration relations)

               X1.l2        X2.l2     constant
X1.l2       1.000000    1.0000000    1.0000000
X2.l2      -1.043634    0.2793248   -0.1931227
constant   35.516701 -917.4329825 -168.0889421

Weights W:
(This is the loading matrix)

             X1.l2        X2.l2      constant
X1.d   -0.03565412 -0.003991993  8.008639e-18
X2.d    0.02731113 -0.015601292 -1.841754e-18

I guess it means that, with 90% confidence, we can say both series are cointegrated in levels.
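Output in this format is produced by `ca.jo` from the `urca` package. A call consistent with the header above (lambda-max statistic, constant restricted to the cointegration relation, lag order 2 matching the `X1.l2`/`X2.l2` labels) would look roughly like this sketch; the exact call used in the question is not shown, so treat the arguments as assumptions:

```r
# Sketch of a Johansen test matching the output above (urca package assumed).
library(urca)

# type = "eigen": maximal-eigenvalue (lambda-max) statistic
# ecdet = "const": constant in the cointegration relation, no linear trend
# K = 2: two lags in levels
jo <- ca.jo(mydata, type = "eigen", ecdet = "const", K = 2)
summary(jo)
```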

5) Then I ran a Granger test to check for interdependence:

grangertest(mydata, order=4)    
Granger causality test  

Model 1: X2 ~ Lags(X2, 1:4) + Lags(X1, 1:4) 
Model 2: X2 ~ Lags(X2, 1:4) 
  Res.Df Df     F Pr(>F)    
1    968                    
2    972 -4 1.273 0.2788    

and I don't know how to interpret this result.

6) Then I ran a VAR, since both series are stationary and cointegrated in levels.

> myvar <- VAR(mydata, p=3, type="const")               
> summary(myvar)                

VAR Estimation Results:             
=========================               
Endogenous variables: X1, X2                
Deterministic variables: const              
Sample size: 978                
Log Likelihood: -7412.001               
Roots of the characteristic polynomial:             
0.9838 0.9403 0.3227 0.256 0.1759 0.1286                
Call:               
VAR(y = mydata, p = 3, type = "const")              


Estimation results for equation X1:                 
=====================================               
X1 = X1.l1 + X2.l1 + X1.l2 + X2.l2 + X1.l3 + X2.l3 + const              

        Estimate Std. Error t value Pr(>|t|)                    
X1.l1    0.96679    0.03226  29.973  < 2e-16 ***                
X2.l1    0.12106    0.01672   7.241 9.07e-13 ***                
X1.l2    0.04524    0.04475   1.011   0.3123                    
X2.l2   -0.05777    0.02337  -2.472   0.0136 *                  
X1.l3   -0.04889    0.03096  -1.579   0.1147                    
X2.l3   -0.03031    0.01731  -1.751   0.0803 .                  
const    2.63598    2.73796   0.963   0.3359                    
---             
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1              


Residual standard error: 7.755 on 971 degrees of freedom                
Multiple R-Squared: 0.9863,     Adjusted R-squared: 0.9862              
F-statistic: 1.166e+04 on 6 and 971 DF,  p-value: < 2.2e-16                 


Estimation results for equation X2:                 
========================================                
X2 = X1.l1 + X2.l1 + X1.l2 + X2.l2 + X1.l3 + X2.l3 + const              

        Estimate Std. Error t value Pr(>|t|)                    
X1.l1    0.02762    0.06228   0.443  0.65753                    
X2.l1    0.97667    0.03228  30.252  < 2e-16 ***                
X1.l2    0.02181    0.08641   0.252  0.80079                    
X2.l2    0.04168    0.04513   0.924  0.35597                    
X1.l3   -0.03310    0.05979  -0.554  0.57997                    
X2.l3   -0.05587    0.03343  -1.671  0.09497 .                  
const   15.32998    5.28683   2.900  0.00382 **                 
---             
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1              


Residual standard error: 14.98 on 971 degrees of freedom                
Multiple R-Squared: 0.9564,     Adjusted R-squared: 0.9561              
F-statistic:  3549 on 6 and 971 DF,  p-value: < 2.2e-16                 

Covariance matrix of residuals:

        X1      X2              
X1   60.15   13.46              
X2   13.46  224.26              

Correlation matrix of residuals:

         X1      X2             
X1   1.0000  0.1159             
X2   0.1159  1.0000             

Questions:

  1. Is my approach correct?
  2. How should I interpret Granger and VAR results?
  3. With this approach, how can I put X1 and X2 in equation term?
  4. Please let me know if I misinterpreted or missed anything.

Thanks for your time and reading.

Best Answer

Regarding (2), stationarity/unit-root testing: you say "That means I can safely conclude that there is no unit root and both series are stationary." The conclusion should be the opposite: you reject the null of the KPSS test (stationarity) and you cannot reject the null of the ADF test (unit root). Both results indicate the presence of a unit root.
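If the series are indeed I(1), a standard sanity check is to difference once and re-test; for a truly I(1) series the differenced version should then reject the ADF null and fail to reject the KPSS null. A sketch, again assuming numeric vectors and the `tseries` package:

```r
# Re-test stationarity after first differencing (tseries package assumed).
library(tseries)

dX1 <- diff(X1)
dX2 <- diff(X2)

adf.test(dX1)   # for an I(1) series, expect rejection (stationary differences)
kpss.test(dX1)  # and no rejection here
adf.test(dX2)
kpss.test(dX2)
```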

Regarding (3), the $selection portion of the output indicates that the best model according to AIC has 4 lags, not 1. You should choose the model that minimizes the AIC, not the one that maximizes it.

Regarding (4), you find cointegration: the null of r = 0 is clearly rejected (48.86 exceeds even the 1% critical value). The result for r <= 1 is less certain, because the test statistic (8.51) falls between the 10% and 5% critical values.

Regarding (5), restricting X2 to depend only on its own lags, and not on the lags of X1, does not reduce the model likelihood by much, so you get an insignificant F statistic (p = 0.2788). That is, you do not reject the null hypothesis that X1 does not Granger-cause X2. I cannot comment on whether your approach is legitimate (due to the variables being both integrated and cointegrated). However, Dave Giles has some excellent posts about Granger causality on his blog, e.g. here and here. After reading them, you should be able to understand it quite well.
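Note that `grangertest(mydata, order=4)` only tests one direction. To test both directions explicitly, the formula interface of `grangertest` (from the `lmtest` package) is clearer; this is a sketch assuming `mydata` is a data frame with columns `X1` and `X2`:

```r
# Granger tests in both directions (lmtest package assumed).
library(lmtest)

# Does X1 Granger-cause X2? (the test shown in the question)
grangertest(X2 ~ X1, order = 4, data = mydata)

# The reverse direction: does X2 Granger-cause X1?
grangertest(X1 ~ X2, order = 4, data = mydata)
```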

Regarding (6), since the series are not stationary, running a VAR in levels is not a good idea. Run a VECM instead.
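A minimal sketch of that route, using `ca.jo` from `urca` and then extracting the VECM estimates, under the assumption of cointegration rank r = 1 (as suggested by the Johansen output above):

```r
# Fit a VECM via the Johansen procedure (urca and vars packages assumed).
library(urca)
library(vars)

# Johansen procedure: lambda-max test, constant in the cointegration relation
jo <- ca.jo(mydata, type = "eigen", ecdet = "const", K = 2)

# VECM estimates with cointegration rank 1
vecm <- cajorls(jo, r = 1)
summary(vecm$rlm)

# Or re-express the VECM in VAR levels form for forecasting / IRFs
var.level <- vec2var(jo, r = 1)
predict(var.level, n.ahead = 10)
```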

I hope my notes largely answer your questions 1 through 4. I did not write out the model equations for you, but I may comment after you try putting them together yourself. (It could be beneficial to try on your own first.)
