Solved – Treating non-stationarity of time series in seasonally adjusted data with R

r, regression, seasonality, stationarity, time series

I'm currently trying to use a variable x (and others) to explain a dependent variable y in a distributed lag model (with the long term goal of predicting variable y). The plot of variable x shows an evident seasonality at the end of the year: See http://i.imgur.com/8gGtaVS.png

After deseasonalizing the data with decompose() (with multiplicative components), the ADF and KPSS tests indicate that the seasonally adjusted data is still not stationary. Because there are more independent variables and I don't want to investigate a cointegration relationship between those series, I thought the most common approach would be to take the difference of both series (with the diff() function).
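
The steps just described can be sketched roughly as follows. This is only an illustration under assumptions: AirPassengers (shipped with R) stands in for the actual series x, and the tests are taken from the tseries package, which the question does not name.

```r
library(tseries)  # assumption: installed; provides adf.test() and kpss.test()

x <- AirPassengers  # stand-in monthly series, not the asker's data

# Seasonally adjust with a multiplicative decomposition
x_sa <- x / decompose(x, type = "multiplicative")$seasonal

# Stationarity tests on the adjusted series
adf_p  <- adf.test(x_sa)$p.value    # H0: the series has a unit root
kpss_p <- kpss.test(x_sa)$p.value   # H0: the series is level stationary

# First difference of the adjusted series, then test again
x_sa_diff   <- diff(x_sa)
adf_p_diff  <- adf.test(x_sa_diff)$p.value
kpss_p_diff <- kpss.test(x_sa_diff)$p.value
```

A small ADF p-value together with a large KPSS p-value would point towards stationarity; the two tests have opposite null hypotheses, so they are read in opposite directions.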

Now there are 2 alternatives:

  1. Take the diff of the already seasonally adjusted data. I'm not sure this is a good idea, because I don't see how to reseasonalize the time series easily for a forecast result.
  2. Take the diff of the raw series. This leads to the following graph. See http://i.imgur.com/loDU2IE.png
    The time series is now stationary according to the ADF and KPSS tests, but the seasonal pattern is still visible. I'm not sure whether it is advisable to use decompose (with the multiplicative method) to deseasonalize the differenced series, especially because it contains zero values, which are left unchanged when the seasonally adjusted time series is calculated in the following way:
decompose(x_diff, "mult")$x / decompose(x_diff, "mult")$season
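
For alternative 1, reseasonalizing may be less awkward than it looks: cumulate the forecast differences from the last adjusted level, then multiply the seasonal factors back in. The sketch below uses base R only. AirPassengers is a stand-in series, and d_hat is a hypothetical placeholder for whatever differences the distributed lag model would actually forecast.

```r
x <- AirPassengers                     # stand-in monthly series
f <- frequency(x)
n <- length(x)

dec  <- decompose(x, type = "multiplicative")
x_sa <- x / dec$seasonal               # seasonally adjusted series
x_sa_diff <- diff(x_sa)                # alternative 1: difference the adjusted data

# Round trip: undo the differencing, then reseasonalize
x_sa_back <- ts(cumsum(c(x_sa[1], x_sa_diff)),
                start = start(x), frequency = f)
x_back <- x_sa_back * dec$seasonal     # should reproduce x up to rounding

# Same idea for a forecast of the differences
h     <- 12
d_hat <- rep(mean(x_sa_diff), h)       # placeholder forecast, purely illustrative
sa_fc <- x_sa[n] + cumsum(d_hat)       # forecast levels of the adjusted series

# dec$figure holds one factor per cycle position, counted from the first observation
idx <- ((n + seq_len(h) - 1) %% f) + 1
fc  <- sa_fc * dec$figure[idx]         # reseasonalized forecast
```

The round trip is only a sanity check that differencing and the multiplicative adjustment can both be inverted; it says nothing about forecast quality.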

So, how should I proceed when I want to include (the diff of) x as an independent (and lagged) variable in a distributed lag model?

Best Answer

  1. Try a seasonal ARIMA model. The auto.arima() function in the "forecast" package selects the model with an automatic algorithm. The function documentation points to a paper by Hyndman and Khandakar.

  2. There are also seasonal differencing and seasonal unit root tests: Canova-Hansen, OCSB, HEGY and a little-known one by Kunst (1997). Search for them in Google Scholar. Denise Osborn is well known in this field. There is also a great time series book by Philip Hans Franses. You should be looking for a seasonal unit root and seasonal lags. A series can have both a seasonal unit root and a non-seasonal unit root, and likewise both seasonal and non-seasonal lags.

  3. Try an ETS model. It is implemented in the ets() function of the forecast package. Again, the documentation points to the literature.
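
A minimal sketch of the three suggestions above, assuming the forecast package is installed and again using AirPassengers as a stand-in for the real data. The choice of test = "ocsb" and test = "kpss" is an assumption made for illustration, not something prescribed by the answer.

```r
library(forecast)  # assumption: installed via install.packages("forecast")

x <- AirPassengers  # stand-in monthly series

# Seasonal unit root: number of seasonal differences suggested by the OCSB test
D <- nsdiffs(x, test = "ocsb")
x_sd <- if (D > 0) diff(x, lag = frequency(x), differences = D) else x

# Non-seasonal unit root: number of first differences suggested by KPSS
d <- ndiffs(x_sd, test = "kpss")

# Automatic seasonal ARIMA and ETS fits on the raw series
fit_arima <- auto.arima(x)
fit_ets   <- ets(x)

# Forecasts for the next 12 months from each model
fc_arima <- forecast(fit_arima, h = 12)
fc_ets   <- forecast(fit_ets, h = 12)
```

Both models handle seasonality internally, so the forecasts come back on the original scale and no separate reseasonalizing step is needed. For a model with regressors, auto.arima() also accepts an xreg argument.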