Conflicting ACF/PACF after first-difference

acf-pacfarimadifferencingtime series

I have yearly data. When I do a Dickey-Fuller test it gives me insignificant results, indicating that the series are non-stationary.
After differencing them the DFT tells me they are now significant and stationary.

When using an ACF/PACF plot on the original series it is obvious that an AR(1) process should be used.

When plotting ACF/PACF for the first difference I can't make that same inference again because there is no significant spike in the ACF.

Best Answer

There is nothing conflicting here. Differencing a unit root process removes the unit root. If the original process is AR(1) with a unit root, $y_t=y_{t-1}+\varepsilon_t$, the differenced process is just white noise, $\Delta y_t=\varepsilon_t$. This is what you see in the ACF and PACF for the differenced data.

Related Solutions

Solved – How to set (p,d,q) and (P,D,Q) for SARIMA time series model

The Box-Jenkins (ARIMA) model identification procedure consists of the following three stages.

Identification consists of using the data and any other knowledge that will tentatively indicate whether the time series can he described with a moving average (MA) model, an autoregressive (AR) model, or a mixed autoregressive – moving average (ARMA) model.
Estimation consists of using the data to make inferences about the parameters that will be needed for the tentatively identified model and to estimate values of them.
Diagnostic checking involves the EXAMINATION of residuals from fitted/tentative models, which can result in either no indication of model inadequacy or model inadequacy, together with information on how the series may be better described.

It is an ITERATIVE process yielding possible latent structure such as pulses, level/step shifts, seasonal pulses and local time trends while validating BOTH

1) constant parameters through time

and

2) constant error variance through time.

https://autobox.com/pdfs/ARIMA%20FLOW%20CHART.pdf details the iterative sequence.

When you post your data , I will attempt to highlight specific decision points.

EDITED AFTER RECEIPT OF DATA (2289 monthly values):

The DF test that you referred to reflects only tests for the need for differencing and ignores seasonal dummies/pulses as possible remedies for non-stationarity.

I used AUTOBOX my tool of choice ( which I have helped to develop ) to iteratively AND logically step through the ARIMA model building process.

The first step is to assess dominance of ARMA structure versus latent deterministic structure by comparing possible error variances from both. The conclusion is that monthly effects (NOT MONTHLY MEMORY) dominates. This is no surprise as it is common knowledge that the month-of-the-year effects are the most important factor when planning a trip to the Cayman Islands not just what occurred last year.

Note that monthly averages ( read : "seasonal pulses" ) are used as an aid to predict/forecast temperature

A partial model list is here suggesting a level shift at or about 1919/6 while incorporating 11 seasonal dummies

The first step yields a set of residuals suggesting the need for possible model augmentation i.e. an ar(1) component effectively adding memory to the model .. and here

The augmented model (1,0,0)(0,0,0)12 with 11 seasonal dummies and one level/step shift is shown here

The Tsay Test for constant error variance suggests a significant error variance reduction at or around period 469 . This test is chronicled here http://docplayer.net/12080848-Outliers-level-shifts-and-variance-changes-in-time-series.html .

Here is the acf of the current model residuals

We proceed to evaluate possible anomalies that might need special attention . Here is the list of one-time pulses that need to be adjusted for in order to obtain robust parameters enabling meaningful tests of significance

Finally we have a useful model with residual plot here with a forecast plot here for the next 36 months and residual histogram here

In summary ... evaluate possible alternative strategies and then much like peeling an onion .. iterate until the error process is free of information suggesting model sufficiency.

Finally the data is non-stationary because there are identifiable fixed/deterministic (read monthly) effects and a level/step shift and a deterministic break-point in error variance.

Here is the Actual/Fit and Forecast graph

Best Answer

Related Solutions

Solved – How to set (p,d,q) and (P,D,Q) for SARIMA time series model

Related Question