Time-Series – Identification Problem in SARIMA: How to Address and Solve

acf-pacf, arima, model-selection, seasonality, time-series

I am trying to carry out a time series analysis with SARIMA and I have a question. My dataset is seasonal. I confirmed with a KPSS test that the series is stationary.


I also found the following results:

ndiffs(ts)   # number of regular differences
[1] 0
nsdiffs(ts)  # number of seasonal differences
[1] 1
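
For anyone who wants to reproduce these checks, here is a minimal sketch. It assumes the forecast package (which provides `ndiffs` and `nsdiffs`) and uses a simulated monthly series `y` as a hypothetical stand-in, since the actual data are not shown. Running `nsdiffs` under more than one criterion is a cheap way to see whether the suggested seasonal difference is robust or depends on the test chosen.

```r
library(forecast)  # provides ndiffs() and nsdiffs()

# Hypothetical stand-in for the real data (not shown in the post):
# a fixed monthly pattern plus stationary AR(1) noise.
set.seed(1)
n_years <- 10
pattern <- 10 * sin(2 * pi * (1:12) / 12)
noise <- as.numeric(arima.sim(model = list(ar = 0.5), n = 12 * n_years))
y <- ts(rep(pattern, n_years) + noise, frequency = 12, start = c(2010, 1))

# Number of regular differences suggested by the KPSS test (the default).
d <- ndiffs(y, test = "kpss")

# Number of seasonal differences under two different criteria:
# "seas" is a seasonal-strength measure, "ocsb" is a seasonal unit root test.
D_seas <- nsdiffs(y, test = "seas")
D_ocsb <- nsdiffs(y, test = "ocsb")

print(c(d = d, D_seas = D_seas, D_ocsb = D_ocsb))
```

If the two criteria disagree, that is a hint that the seasonality may be deterministic rather than the result of a seasonal unit root.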

Following these results, I took the seasonal difference of the series and then plotted the ACF and PACF of the differenced series:

ACF of Differenced TS
PACF of Differenced TS

I don't think I managed to identify a suitable model. I thought that the following three models could fit the data:

1) SARIMA(1,0,1)(1,1,1)[12]
2) SARIMA(1,0,2)(1,1,1)[12]
3) SARIMA(1,0,3)(1,1,1)[12]

However, when I looked at the summaries of the three models, I got the following results:

1)
Summary1

2)
Summary3

3)
Summary2

I also tried auto.arima, but the model it selected turned out to be insignificant as well.
I think I am missing something, as I am very new to this field. Does anybody have an idea?

Edit:

I also tried seasonal dummy variables, on the advice of @richard.
In the resulting regression, all seasonal dummies are significant and the model has an R^2 of 95%.
When I plot the ACF and PACF of the residuals of the regression model, I get the following:

ACF function of residuals of regression
PACF function of residuals of regression
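
A seasonal dummy regression of this kind can be sketched in base R as follows. The series `y` is simulated and purely hypothetical (the real data are not shown), and the variable names are illustrative.

```r
# Hypothetical stand-in for the real data (not shown in the post):
# a fixed monthly pattern plus stationary AR(1) noise.
set.seed(1)
n_years <- 10
pattern <- 10 * sin(2 * pi * (1:12) / 12)
noise <- as.numeric(arima.sim(model = list(ar = 0.5), n = 12 * n_years))
y <- ts(rep(pattern, n_years) + noise, frequency = 12, start = c(2010, 1))

# One dummy per month; lm() drops one level as the baseline automatically.
month <- factor(cycle(y))
fit_dummy <- lm(as.numeric(y) ~ month)
r2 <- summary(fit_dummy)$r.squared
print(r2)

# Residuals as a monthly ts, so the lag axis of the plots is in years.
res <- ts(residuals(fit_dummy), frequency = 12, start = start(y))
acf(res, lag.max = 36)
pacf(res, lag.max = 36)
```

Any short-lag autocorrelation left in the residual plots suggests adding an ARMA error term to the regression rather than differencing the series.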

Best Answer

This looks like a case of overdifferencing; notice the large and statistically significant negative partial autocorrelation at the seasonal lag 12. The KPSS test correctly indicated stationarity, while the subsequent seasonal differencing assessment produced a contradictory and, I believe, misleading result. (You should have noticed the contradiction, as a series cannot be both stationary and seasonally integrated at the same time.)
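
The signature of overdifferencing can be illustrated on simulated data. The sketch below is only an illustration under stated assumptions: `y` is a hypothetical series with a fixed seasonal pattern and AR(1) noise, not the asker's data. Seasonally differencing such a series introduces a non-invertible seasonal MA component, which typically shows up as a strongly negative autocorrelation and partial autocorrelation at lag 12.

```r
# Hypothetical series: stationary around a fixed seasonal pattern.
set.seed(1)
n_years <- 10
pattern <- 10 * sin(2 * pi * (1:12) / 12)
noise <- as.numeric(arima.sim(model = list(ar = 0.5), n = 12 * n_years))
y <- ts(rep(pattern, n_years) + noise, frequency = 12, start = c(2010, 1))

# Seasonally difference a series that does not need it.
y_sdiff <- diff(y, lag = 12)

# Sample ACF and PACF of the differenced series (no plotting here).
acf_sd <- acf(y_sdiff, lag.max = 36, plot = FALSE)
pacf_sd <- pacf(y_sdiff, lag.max = 36, plot = FALSE)

# acf() output starts at lag 0, pacf() output starts at lag 1,
# so the seasonal lag 12 sits at positions 13 and 12 respectively.
acf_lag12 <- acf_sd$acf[13]
pacf_lag12 <- pacf_sd$acf[12]
print(c(acf_lag12 = acf_lag12, pacf_lag12 = pacf_lag12))
```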

The original plot shows quite clearly that the series is stationary around a seasonal pattern rather than being a combination of 12 random walks (one in each season) which would be the case if the series were seasonally integrated. Seasonal differencing is thus not warranted, and use of seasonal dummies, Fourier terms or similar should do the job.
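
As a minimal sketch of that alternative, the code below fits a regression on Fourier terms with AR(1) errors and no differencing, using base R's `arima` with `xreg`. The data are simulated and hypothetical, the number of harmonics `K` and the AR(1) error structure are assumptions chosen for illustration, and in practice both would be selected by comparing information criteria and checking residuals. (`forecast::fourier` can generate the same kind of regressors.)

```r
# Hypothetical series: stationary around a fixed seasonal pattern.
set.seed(1)
n_years <- 10
pattern <- 10 * sin(2 * pi * (1:12) / 12)
noise <- as.numeric(arima.sim(model = list(ar = 0.5), n = 12 * n_years))
y <- ts(rep(pattern, n_years) + noise, frequency = 12, start = c(2010, 1))

# Fourier terms built by hand: K pairs of sin/cos at seasonal harmonics.
t_idx <- seq_along(y)
K <- 2  # illustrative choice, not a recommendation
X <- do.call(cbind, lapply(1:K, function(k) {
  cbind(sin(2 * pi * k * t_idx / 12), cos(2 * pi * k * t_idx / 12))
}))
colnames(X) <- paste0(rep(c("S", "C"), K), rep(1:K, each = 2))

# Regression with AR(1) errors and no differencing (d = 0, D = 0).
fit <- arima(y, order = c(1, 0, 0), xreg = X)
print(fit)

# Ljung-Box test for leftover autocorrelation in the residuals;
# fitdf = 1 accounts for the single estimated AR coefficient.
lb <- Box.test(residuals(fit), lag = 24, type = "Ljung-Box", fitdf = 1)
print(lb)
```

Replacing `X` with 11 monthly dummy columns gives the seasonal dummy version of the same model.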
