I am trying to use statsmodels Granger Causality test: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.grangercausalitytests.html
to assess whether "positivity score" affects value.
Here is the code I am using:
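from statsmodels.tsa.stattools import grangercausalitytests
# Applying differencing
condensed_df['value'] = condensed_df['value'] - condensed_df['value'].shift(1)
# Drop the first row, which is NaN after differencing (assumes a default 0-based index)
condensed_df = condensed_df.drop(0)
# Running granger causality test: does the second column Granger-cause the first?
dct_pos_granger_causality = grangercausalitytests(condensed_df[["value", "daily_avg_positive_score"]], maxlag = 4, verbose=False)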
I have a total of 1,008 rows in the dataframe.
The results are as follows:
{1: ({'ssr_ftest': (0.005356633438031601, 0.941670291866298, 1003.0, 1), 'ssr_chi2test': (0.0053726552728412666, 0.9415686658133314, 1), 'lrtest': (0.005372640925997985, 0.9415687436896775, 1), 'params_ftest': (0.0053566334379265765, 0.9416702918669032, 1003.0, 1.0)})
2: ({'ssr_ftest': (0.25177289420871873, 0.7774705403356538, 1000.0, 2), 'ssr_chi2test': (0.5060635173595247, 0.7764432226205071, 2), 'lrtest': (0.5059361470375734, 0.7764926721067107, 2), 'params_ftest': (0.25177289420872345, 0.7774705403356538, 1000.0, 2.0)})
3: ({'ssr_ftest': (0.24649533124441178, 0.8638565929333925, 997.0, 3), 'ssr_chi2test': (0.7446779716230374, 0.862648253967841, 3), 'lrtest': (0.7444019401355035, 0.8627137383746588, 3), 'params_ftest': (0.2464953312443746, 0.8638565929334187, 997.0, 3.0)})
4: ({'ssr_ftest': (0.6384235515822775, 0.6351740781255001, 994.0, 4), 'ssr_chi2test': (2.576816186064484, 0.6309354793595714, 4), 'lrtest': (2.57351178378849, 0.6315224927789413, 4), 'params_ftest': (0.6384235515823179, 0.6351740781254609, 994.0, 4.0)})}
I am struggling to interpret the results. Taking the first ssr_chi2test as an example, (0.0053726552728412666, 0.9415686658133314, 1), am I correct in thinking that 0.005 is the test statistic, 0.94 the p-value and 1 the degrees of freedom?
If this is correct, then the null hypothesis clearly cannot be rejected. Could it also be that there is not enough data, given that there is only one degree of freedom?
Any clarity would be appreciated!
Best Answer
A look into the documentation of grangercausalitytests() indeed helps. So yes, your interpretation of the test output is correct. Note further that, depending on whether the test is based on the $\chi^2$- or the F-distribution, you get one or two numbers for the degrees of freedom (which is why the test tuples have different lengths).
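To make the tuple layout concrete, here is a minimal sketch that labels the entries of the tuples posted in the question. The helper `describe` and the field names are made up for illustration; only the numbers come from the question. With the real return value you would read the test dictionary from `dct_pos_granger_causality[lag][0]`, since each lag maps to a tuple whose first element is that dictionary.

```python
# Subset of the results posted in the question (values copied verbatim).
results = {
    1: {"ssr_ftest": (0.005356633438031601, 0.941670291866298, 1003.0, 1),
        "ssr_chi2test": (0.0053726552728412666, 0.9415686658133314, 1)},
    4: {"ssr_ftest": (0.6384235515822775, 0.6351740781255001, 994.0, 4),
        "ssr_chi2test": (2.576816186064484, 0.6309354793595714, 4)},
}


def describe(test_name, values):
    """Label the entries of one test tuple (hypothetical helper)."""
    if len(values) == 4:
        # F-based tests: (statistic, p-value, denominator df, numerator df)
        stat, pvalue, df_denom, df_num = values
        return {"test": test_name, "statistic": stat, "pvalue": pvalue,
                "df_num": df_num, "df_denom": df_denom}
    # chi2-based tests: (statistic, p-value, df)
    stat, pvalue, df = values
    return {"test": test_name, "statistic": stat, "pvalue": pvalue, "df": df}


rows = [
    dict(lag=lag, **describe(name, values))
    for lag, tests in results.items()
    for name, values in tests.items()
]

for row in rows:
    print(row)
```

Every p-value here is far above any conventional significance level, so the null hypothesis of no Granger causality is not rejected at any of these lags.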
However, inferring from the degrees of freedom that you do not have enough data does not make sense here. The degrees of freedom of the $\chi^2$- and LR-tests equal the number of imposed zero restrictions, which here coincides with the number of lags. They therefore vary only with the number of lags and do not depend on the number of observations. In fact, I would even argue that about 1000 observations is a usual order of magnitude in such a setting and thus "enough data".
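The posted numbers can be used to check this. The sketch below assumes that the unrestricted regression contains the lags of both series plus a constant (the function's default, as far as I can tell), and that the $\chi^2$ statistic relates to the F statistic as chi2 = F * lag * nobs / df_denom. Both are assumptions on my part rather than something stated in the question, but they reproduce the posted values. The sample size enters only through the denominator degrees of freedom of the F-tests, while the $\chi^2$ degrees of freedom are simply the number of lags.

```python
# Values copied from the question:
# lag -> (F statistic, F denominator df, chi2 statistic)
posted = {
    1: (0.005356633438031601, 1003.0, 0.0053726552728412666),
    2: (0.25177289420871873, 1000.0, 0.5060635173595247),
    3: (0.24649533124441178, 997.0, 0.7446779716230374),
    4: (0.6384235515822775, 994.0, 2.576816186064484),
}

n_rows = 1008 - 1  # 1,008 rows minus the one dropped after differencing

checks = {}
for lag, (f_stat, df_denom_posted, chi2_posted) in posted.items():
    nobs = n_rows - lag          # each lag costs one usable observation
    n_params = 2 * lag + 1       # lags of both series plus a constant (assumed)
    df_denom = nobs - n_params   # residual df of the unrestricted model
    chi2_from_f = f_stat * lag * nobs / df_denom  # assumed F-to-chi2 relation
    checks[lag] = (df_denom, chi2_from_f)
    print(f"lag={lag}: chi2 df={lag}, F denominator df={df_denom} "
          f"(posted {df_denom_posted}), chi2 from F={chi2_from_f:.6f} "
          f"(posted {chi2_posted:.6f})")
```

With about 1000 observations the denominator degrees of freedom stay close to 1000 for all four lags, which is another way of seeing that the amount of data is not the problem.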