I am fairly new to time series analysis. I am using hourly data covering a six-month period, and my time series has weekly seasonality. Per Dr. Robert Hyndman,
I set up my time series variable and then ran kpss.test:
cooltemp = ts(training$vol, frequency = 168)
checkl <- kpss.test(cooltemp, null = "Level")
checkt <- kpss.test(cooltemp, null = "Trend")
The output of checkl was:
KPSS Test for Level Stationarity
data: cooltemp
KPSS Level = 4.0716, Truncation lag parameter = 13, p-value = 0.01
The output of checkt was:
KPSS Test for Trend Stationarity
data: cooltemp
KPSS Trend = 1.24, Truncation lag parameter = 13, p-value = 0.01
There is weekly seasonality in the data, and I assumed that checkt should report "p-value greater than printed p-value" for the trend test.
The reason is that I checked with this example:
x <- 0.3*(1:1000)+rnorm(1000)
xt <- kpss.test(x, null = "Trend")
xl <- kpss.test(x, null = "Level")
The result of xt is:
KPSS Test for Trend Stationarity
data: x
KPSS Trend = 0.042451, Truncation lag parameter = 7, p-value = 0.1
This shows that the null hypothesis of trend stationarity cannot be rejected. The result of xl is:
KPSS Test for Level Stationarity
data: x
KPSS Level = 12.598, Truncation lag parameter = 7, p-value = 0.01
Here the null hypothesis of level stationarity is rejected.
I am confused about how to interpret my case, where both p-values are 0.01 and I have to reject both null hypotheses.
Best Answer
First, for hourly data the frequency of the time series is usually set to 24, so that one seasonal cycle is one day (168 = 24 * 7 is the corresponding choice when one cycle is one week):
cooltemp = ts(training$vol, frequency = 24)
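To make the role of the frequency argument concrete, here is a minimal sketch on a made-up hourly vector (hourly is an assumption standing in for training$vol). The frequency is simply the number of observations per seasonal cycle, and it determines how many observations a seasonal difference at lag = frequency(x) consumes.

```r
# Synthetic stand-in for training$vol: four weeks of hourly values (illustrative only).
set.seed(1)
hourly <- rnorm(24 * 7 * 4)

daily_ts  <- ts(hourly, frequency = 24)   # one seasonal cycle = one day
weekly_ts <- ts(hourly, frequency = 168)  # one seasonal cycle = one week

frequency(daily_ts)   # 24
frequency(weekly_ts)  # 168

# A seasonal difference at lag = frequency(x) drops one full cycle of observations.
n_daily  <- length(diff(daily_ts,  lag = frequency(daily_ts)))   # 672 - 24  = 648
n_weekly <- length(diff(weekly_ts, lag = frequency(weekly_ts)))  # 672 - 168 = 504
```

Which value is appropriate depends on which cycle you want the seasonal tools to act on.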
Second, with the KPSS test you reject the null hypothesis of stationarity if p-value < 0.05. The message "p-value greater than printed p-value" does not change that rule: kpss.test interpolates the p-value from a table that only covers 0.01 to 0.1, so a statistic outside the table is reported at the nearest bound together with a warning. "Greater than printed" can only accompany a printed value of 0.1, and "smaller than printed" a printed value of 0.01. Your p-value for checkt is printed as 0.01, so it cannot be greater than 0.05.
From the outputs for checkl and checkt, we can say that your series is neither level-stationary nor trend-stationary. You should consider differencing the time series to make it stationary. Since you know there is weekly seasonality in the series, I would suggest taking a seasonal difference, that is, differencing at the seasonal lag (168 hours for a weekly cycle).
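A seasonal difference can be sketched as follows. This is a minimal illustration on simulated data, not the original answer's code: the vol vector is invented to stand in for training$vol, and the frequency of 168 assumes the weekly cycle is the one being removed.

```r
# Simulated hourly series: a repeating weekly pattern plus noise (illustrative only).
set.seed(42)
n_weeks <- 6
weekly_pattern <- 10 * sin(2 * pi * (1:168) / 168)
vol <- rep(weekly_pattern, n_weeks) + rnorm(168 * n_weeks)

cooltemp <- ts(vol, frequency = 168)

# Seasonal difference: subtract the value observed one week (168 hours) earlier.
cooltemp_sdiff <- diff(cooltemp, lag = frequency(cooltemp))

# The repeating weekly pattern cancels, so the spread shrinks to that of the noise.
sd_before <- sd(cooltemp)
sd_after  <- sd(cooltemp_sdiff)
print(c(before = sd_before, after = sd_after))
```

The differenced series is one week shorter than the original, because the first 168 observations have no earlier value to subtract.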
If you wish to check how many seasonal and first differences are required to make the series stationary, you can try the following code:
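library(forecast)  # provides nsdiffs() and ndiffs()
x <- cooltemp
# Number of seasonal differences suggested for x
ns <- nsdiffs(x)
if (ns > 0) {
  xstar <- diff(x, lag = frequency(x), differences = ns)
} else {
  xstar <- x
}
# Number of first differences still needed after seasonal differencing
nd <- ndiffs(xstar)
if (nd > 0) {
  xstar <- diff(xstar, differences = nd)
}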
Source: Hyndman, R. J., & Athanasopoulos, G. (2016). 8.1 Stationarity and differencing. In Forecasting: principles and practice. Heathmont: OTexts.
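As a follow-up check, the KPSS test can be rerun on a differenced series. The sketch below is illustrative only: it uses simulated data rather than training$vol, and it assumes the tseries package is installed. It also shows that the p-value reported by kpss.test always lies between 0.01 and 0.1, which is why both tests in the question print exactly 0.01.

```r
library(tseries)  # provides kpss.test(); assumed to be installed

set.seed(123)
random_walk <- cumsum(rnorm(500))  # simulated non-stationary series (illustrative)
differenced <- diff(random_walk)   # first difference recovers the white-noise steps

# suppressWarnings() hides the "p-value smaller/greater than printed p-value" notes.
p_before <- suppressWarnings(kpss.test(random_walk, null = "Level"))$p.value
p_after  <- suppressWarnings(kpss.test(differenced, null = "Level"))$p.value

# Both reported values are confined to the tabulated range [0.01, 0.1].
print(c(before = p_before, after = p_after))
```

A value above 0.05 after differencing would mean the null hypothesis of level stationarity is no longer rejected for the differenced series.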