I have two time series (X1 and X2), each with 900 records. I want to establish the relationship between them and express it as an equation. I did the following:
1) Checked the correlation; it came out as 0.80. As I wanted to build a robust model, I read further and learned that correlation alone is not the right way to find a relation between time series, since it can be a case of spurious regression.
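As a quick illustration of why that caution is warranted (this is simulated data, not the asker's series): two completely independent random walks routinely show a large sample correlation, even though there is no true relationship between them.

```r
# Spurious correlation demo: correlate pairs of independent random walks.
set.seed(1)
cors <- replicate(200, {
  x1 <- cumsum(rnorm(900))  # random walk, unrelated to x2
  x2 <- cumsum(rnorm(900))  # another independent random walk
  cor(x1, x2)
})
mean(abs(cors))  # typically far from 0, despite zero true dependence
```

So a correlation of 0.80 between two trending series is, by itself, weak evidence of a genuine link.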
2) Then I tested both time series for stationarity and I got below p-values.
Time series X1 X2
ADF test 0.28 0.07
KPSS test 0.01 0.01
That means I can safely conclude that there is no unit root and both series are stationary.
3) Then I checked the lag length using VARselect and got the following results:
VARselect(mydata, lag.max=8, type="const")
$selection
AIC(n) HQ(n) SC(n) FPE(n)
4 2 1 4
$criteria
1 2 3 4 5
AIC(n) 9.536526 9.518359 9.517730 9.514627 9.515910
HQ(n) 9.547980 9.537448 9.544454 9.548987 9.557905
SC(n) 9.566621 9.568517 9.587951 9.604911 9.626258
FPE(n) 13856.729981 13607.265067 13598.708504 13556.586100 13574.003525
6 7 8
AIC(n) 9.518960 9.522675 9.528970
HQ(n) 9.568591 9.579942 9.593872
SC(n) 9.649371 9.673149 9.699507
FPE(n) 13615.481638 13666.186272 13752.515062
I guess that means I should choose 1 as the lag length, since AIC(n) is highest for 1. Please correct me if I am wrong. (The data I have is daily for the last 3 years.)
4) Then I performed Johansen's test:
######################
# Johansen-Procedure #
######################
Test type: maximal eigenvalue statistic (lambda max) , without linear trend and constant in cointegration
Eigenvalues (lambda):
[1] 4.868739e-02 8.650614e-03 -2.834784e-19
Values of teststatistic and critical values of test:
test 10pct 5pct 1pct
r <= 1 | 8.51 7.52 9.24 12.97
r = 0 | 48.86 13.75 15.67 20.20
Eigenvectors, normalised to first column:
(These are the cointegration relations)
X1.l2 X2.l2 constant
X1.l2 1.000000 1.0000000 1.0000000
X2.l2 -1.043634 0.2793248 -0.1931227
constant 35.516701 -917.4329825 -168.0889421
Weights W:
(This is the loading matrix)
X1.l2 X2.l2 constant
X1.d -0.03565412 -0.003991993 8.008639e-18
X2.d 0.02731113 -0.015601292 -1.841754e-18
I guess it means that, with 90% confidence, we can say both series are cointegrated in levels.
5) Then I ran a Granger causality test to check for interdependence:
grangertest(mydata, order=4)
Granger causality test
Model 1: X2 ~ Lags(X2, 1:4) + Lags(X1, 1:4)
Model 2: X2 ~ Lags(X2, 1:4)
Res.Df Df F Pr(>F)
1 968
2 972 -4 1.273 0.2788
and I don't know how to interpret this result.
6) Then I ran a VAR, since both series are stationary and cointegrated in levels.
> myvar <- VAR(mydata, p=3, type="const")
> summary(myvar)
VAR Estimation Results:
=========================
Endogenous variables: X1, X2
Deterministic variables: const
Sample size: 978
Log Likelihood: -7412.001
Roots of the characteristic polynomial:
0.9838 0.9403 0.3227 0.256 0.1759 0.1286
Call:
VAR(y = mydata, p = 3, type = "const")
Estimation results for equation X1:
=====================================
X1 = X1.l1 + X2.l1 + X1.l2 + X2.l2 + X1.l3 + X2.l3 + const
Estimate Std. Error t value Pr(>|t|)
X1.l1 0.96679 0.03226 29.973 < 2e-16 ***
X2.l1 0.12106 0.01672 7.241 9.07e-13 ***
X1.l2 0.04524 0.04475 1.011 0.3123
X2.l2 -0.05777 0.02337 -2.472 0.0136 *
X1.l3 -0.04889 0.03096 -1.579 0.1147
X2.l3 -0.03031 0.01731 -1.751 0.0803 .
const 2.63598 2.73796 0.963 0.3359
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7.755 on 971 degrees of freedom
Multiple R-Squared: 0.9863, Adjusted R-squared: 0.9862
F-statistic: 1.166e+04 on 6 and 971 DF, p-value: < 2.2e-16
Estimation results for equation X2:
========================================
X2 = X1.l1 + X2.l1 + X1.l2 + X2.l2 + X1.l3 + X2.l3 + const
Estimate Std. Error t value Pr(>|t|)
X1.l1 0.02762 0.06228 0.443 0.65753
X2.l1 0.97667 0.03228 30.252 < 2e-16 ***
X1.l2 0.02181 0.08641 0.252 0.80079
X2.l2 0.04168 0.04513 0.924 0.35597
X1.l3 -0.03310 0.05979 -0.554 0.57997
X2.l3 -0.05587 0.03343 -1.671 0.09497 .
const 15.32998 5.28683 2.900 0.00382 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 14.98 on 971 degrees of freedom
Multiple R-Squared: 0.9564, Adjusted R-squared: 0.9561
F-statistic: 3549 on 6 and 971 DF, p-value: < 2.2e-16
Covariance matrix of residuals:
X1 X2
X1 60.15 13.46
X2 13.46 224.26
Correlation matrix of residuals:
X1 X2
X1 1.0000 0.1159
X2 0.1159 1.0000
Questions:
- Is my approach correct?
- How should I interpret Granger and VAR results?
- With this approach, how can I put X1 and X2 into an equation?
- Please let me know if I misinterpreted or missed anything.
Thanks for your time and reading.
Best Answer
Regarding (2), stationarity/unit-root testing: you say, "That means I can safely conclude that there is no unit root and both series are stationary." The conclusion should be the opposite: you reject the H0 of the KPSS test (whose null is stationarity) and you cannot reject the H0 of the ADF test (whose null is a unit root). Both results indicate the presence of a unit root.
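For reference, here is a minimal sketch of how these two tests are typically run in R (assuming the tseries package was used; urca::ur.df and urca::ur.kpss are an alternative), with the opposite null hypotheses spelled out. The series here is simulated, standing in for your X1/X2:

```r
library(tseries)  # assumption: tseries was used; urca is an alternative

set.seed(42)
x <- cumsum(rnorm(900))  # simulated unit-root series, for illustration only

# ADF: H0 = unit root. A large p-value means you cannot reject the unit root.
p_adf <- adf.test(x)$p.value

# KPSS: H0 = (level) stationarity. A small p-value rejects stationarity.
p_kpss <- suppressWarnings(kpss.test(x)$p.value)
```

With your p-values (ADF 0.28 and 0.07; KPSS 0.01 and 0.01), both tests point the same way: toward a unit root in each series.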
Regarding (3), the $selection lines of the output indicate that the best model according to AIC has 4 lags. You would choose the model that minimizes the AIC, not maximizes it.
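To make the selection rule concrete, take the AIC(n) values printed in your $criteria output and pick the minimum, not the maximum:

```r
# AIC(n) values from the VARselect output, lags 1 through 8
aic <- c(9.536526, 9.518359, 9.517730, 9.514627, 9.515910,
         9.518960, 9.522675, 9.528970)
which.min(aic)  # lag order chosen by AIC: 4
```

This matches the `AIC(n) = 4` entry under $selection.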
Regarding (4), you find cointegration, although the result is not very certain, because the test statistic for r <= 1 falls between the 10% and 5% critical values.

Regarding (5), restricting X2 to depend only on its own lags, and not on the lags of X1, does not cause the model likelihood to drop very much, so you get an insignificant $F$ statistic. That is, you do not reject the null hypothesis that X1 does not Granger-cause X2. I cannot comment on whether your approach is legitimate (due to the variables being both integrated and cointegrated). However, Dave Giles has some excellent posts about Granger causality on his blog, e.g. here and here. After reading them, you should be able to understand it quite well.

Regarding (6), since the series are not stationary, running a VAR in levels is not a good idea. Run a VECM instead.
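A minimal sketch of the VECM route (assuming the urca and vars packages; the simulated two-column data frame below is a hypothetical stand-in for your mydata):

```r
library(urca)
library(vars)

# Hypothetical stand-in for mydata: two series sharing a stochastic trend
set.seed(7)
trend  <- cumsum(rnorm(900))
mydata <- data.frame(X1 = trend + rnorm(900),
                     X2 = 0.5 * trend + rnorm(900))

# Johansen test, then estimate the VECM with r = 1 cointegrating relation
jo  <- ca.jo(mydata, type = "eigen", ecdet = "const", K = 2)
vec <- cajorls(jo, r = 1)   # restricted VECM estimates (error-correction form)

# Level-VAR representation, e.g. for impulse responses or forecasting
var_rep <- vec2var(jo, r = 1)
```

The cajorls output gives you the short-run dynamics plus the error-correction term, which is the equation form you were after for X1 and X2.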
I hope my notes largely answer your questions 1 through 4. I did not write out the model equations for you, but I may comment after you try putting them together yourself. (It could be beneficial to try by yourself first.)