I have been trying to understand the KPSS test. I have read this answer and the article "KPSS Test: Definition and Interpretation", but I am still confused about my own results.
I am confused because the critical values at the 1%, 5%, 10% (etc.) levels that I get for my own data do not follow the order I would expect.
The null hypothesis is that the data is stationary.
I reject the null hypothesis if the test statistic > the critical value.
I would expect it to be "easier" to reject the null at the 1% and "harder" to reject the null at 10%. In other words, I expect the critical values to be in the sequence 1% < 2.5% < 5% < 10%.
I have a series of hourly temperature data, and when I run the KPSS test from statsmodels I get the following output:
kpss(series, lags = 230)
(0.70411510645393605,
0.013171353958733086,
230,
{'1%': 0.739, '10%': 0.347, '2.5%': 0.574, '5%': 0.463})
Do my results mean that I would reject the null hypothesis at the 10%, but not at the 1%?
Also, in the table of critical values in "KPSS Test: Definition and Interpretation" they are in the order 1% > 5% > 10%, which would imply that for a given test statistic you can reject at the 5% or 10% level, but not at the 1% level.
Can someone tell me what I am missing?
Best Answer
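As already mentioned in the comments, the test statistic has to be more extreme than the chosen critical value. The blog post you linked has a good image illustrating this.
The test statistic marks a position in its probability distribution under the null hypothesis. Depending on the test, "more extreme" can mean further into the left tail or further into the right tail. For KPSS the statistic is non-negative and the rejection region is the right tail, so more extreme means larger. That is why the 1% critical value (0.739) is larger than the 10% one (0.347): rejecting at a stricter significance level requires a more extreme statistic.
Since the KPSS statistic cannot be negative, comparing absolute values gives the same rule as comparing the raw values: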
|kpss_val| > |critical_value| = null rejected
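Applied to the numbers from the question, the rule can be sketched as follows. This is only an illustration: the helper name `kpss_decisions` is made up for this example, and the values are copied from the output shown in the question.

```python
# Values copied from the kpss() output in the question.
kpss_stat = 0.70411510645393605
critical_values = {"1%": 0.739, "10%": 0.347, "2.5%": 0.574, "5%": 0.463}


def kpss_decisions(stat, crit):
    """Map each significance level to True if the null (stationarity) is rejected."""
    # KPSS is right-tailed: reject when the statistic exceeds the critical value.
    return {level: stat > value for level, value in crit.items()}


decisions = kpss_decisions(kpss_stat, critical_values)

# Print from the strictest level (1%) to the most lenient (10%).
for level in sorted(decisions, key=lambda s: float(s.rstrip("%"))):
    verdict = "reject" if decisions[level] else "fail to reject"
    print(f"{level:>5}: {verdict} stationarity")
```

So with a statistic of about 0.704, stationarity is rejected at the 10%, 5% and 2.5% levels, but not at the 1% level.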
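You can also use the p-value returned by the statsmodels implementation. Note that it is interpolated from the critical-value table, so it is only ever reported in the range [0.01, 0.1]: a reported 0.01 means the true p-value is 0.01 or smaller (reject), and a reported 0.1 means it is 0.1 or larger (do not reject at the usual levels).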
p < significance_level = null rejected
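To see where the reported p-value comes from, here is a minimal sketch that interpolates it by hand from the critical values in the question. It assumes the interpolation is linear between table entries and clipped at both ends; the function name `interpolated_p_value` is hypothetical and not part of statsmodels.

```python
# Critical values from the question, sorted by increasing statistic,
# paired with their significance levels.
CRIT = [0.347, 0.463, 0.574, 0.739]
PVALS = [0.10, 0.05, 0.025, 0.01]


def interpolated_p_value(stat, crit=CRIT, pvals=PVALS):
    """Linearly interpolate a p-value from the table, clipped at its ends."""
    if stat <= crit[0]:
        return pvals[0]   # statistic below the table: report the largest p-value
    if stat >= crit[-1]:
        return pvals[-1]  # statistic above the table: report the smallest p-value
    for i in range(len(crit) - 1):
        lo, hi = crit[i], crit[i + 1]
        if lo <= stat <= hi:
            frac = (stat - lo) / (hi - lo)
            return pvals[i] + frac * (pvals[i + 1] - pvals[i])


p = interpolated_p_value(0.70411510645393605)
print(f"p-value: {p:.6f}")

# Decision rule from above: reject when p < significance level.
for alpha in (0.10, 0.05, 0.025, 0.01):
    print(f"alpha={alpha}: {'reject' if p < alpha else 'fail to reject'}")
```

For the statistic in the question this gives roughly 0.01317, which agrees with the p-value in the output shown there. It is below 0.025, 0.05 and 0.10 but above 0.01, so the p-value rule and the critical-value rule lead to the same decisions.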
If it's still not clear, I suggest reading the related chapter in the Wikipedia article. It explains more of the intuition behind hypothesis testing and rejection.
To conclude, here is the code I came up with for my time series analysis (testing KPSS and ADF):
Output:
I hope I did everything right. Otherwise don't hesitate to give me feedback so we have a final answer to this problem.
Greetings, Thomas