Solved – Testing for Granger Causality

granger-causality, python, statsmodels

I have two time series (Stocks and GDP) that I want to check for Granger causality. After reading the literature and the documentation of various statistics packages (Python statsmodels), I'm a little puzzled: what are the necessary steps for conducting a Granger causality test?

  1. First, I understand that both time series should be
    stationary if we want to measure Granger causality. Here, the ADF test is a unit-root test that checks whether a time series is
    stationary or not. In my case, both time series are stationary at
    level.
  2. Second, I should check the lag order to determine the
    maximum lag length for the Granger causality analysis. I do that via
    model.select_order(10) in Python statsmodels and check which lags are indicated, for example, by AIC and BIC.
  3. Now, how about cointegration? How do I check for cointegration
    in IPython and how do I interpret the results? I read in the literature about the "order of integration", written as I(0), I(1), I(2). I do not really understand what it means or how to compute it.

Otherwise, I'm well informed about how to check Granger itself and how to interpret its results.

Thanks for your help!

Best Answer

Follow this procedure (Engle-Granger Test for Cointegration):

1) Test whether your series are stationary using the ADF test (adfuller in statsmodels). Stock prices and GDP levels usually are not.

2) If they are not, difference them and see if the differenced series are now stationary (they usually are).

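3) If they are, each of your ORIGINAL series is said to be integrated (I did not say co-integrated) of order 1, concisely written as I(1). More generally, a series is I(d) if it has to be differenced d times to become stationary, so a series that is already stationary is I(0).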

4) If they are not both I(1), you can safely say that they cannot be co-integrated of order 1.

5) If they are both I(1), run a simple OLS regression of one on the other.

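6) Check the residuals of the OLS regression for stationarity. If they are stationary, then your original series are co-integrated of order 1. Because the residuals come from an estimated regression, the standard ADF critical values do not apply; use Engle-Granger critical values instead.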

Shortcomings of this method: 1) It may matter which variable you regress on the other, 2) it works only when you have two variables.

For a better test, you can use Johansen's procedure (https://github.com/josef-pkt/statsmodels/commit/29f0aa27d284ac0026e90ff9d877f7920a2c6056).

A cointegration example notebook is available at http://nbviewer.jupyter.org/github/mapsa/seminario-doc-2014/blob/master/cointegration-example.ipynb.