Solved – Steps to perform time series analysis

autoregressive, time series

I'm trying to estimate an autoregressive model with an exogenous variable. It's about the impact of changes in oil prices on the economy. I'm planning on regressing the GDP growth rate on its own lagged values and on lagged values of the oil price.

I don't know where to start. Here's what I think so far:

  1. I should check whether GDP and the oil price are stationary, right? (Dickey-Fuller test?)
  2. Find the number of lags for GDP and the oil price. (AIC criterion?)
  3. Estimate the model.
  4. Test the significance of the estimates, then test for heteroskedasticity and serial correlation.

Should I test for the presence of cointegration? Later I'm planning on using the same variables in a Markov switching model. I have the same question about the Granger causality test.

I've been reading a few books and watching more than one video about time series, but I still can't find my way through it. Can you please guide me and tell me if there is something missing in those steps?

Best Answer

It sounds like you want to fit an ARIMAX model to your time series. I would first try to fit an ADL (autoregressive distributed lag) model or an ECM (error correction model), or apply the Engle-Granger two-step procedure, to see whether your series cointegrate and, if they do, to estimate the long-run relationship between them. If they do not cointegrate, continue with the ARIMAX model or estimate stationary ADL or ECM models. Note that an ADL model and an ARIMAX model are very similar. Although cointegration analysis with several variables is quite an endeavour and fills entire textbooks (see e.g. Katarina Juselius' “The Cointegrated VAR Model: Methodology and Applications”), cointegration analysis with only two variables is quite fast and easy, depending on which approach you want to use. Note that part of this answer repeats what I wrote in response to a similar question. I will outline the steps you should follow in order to model the time series appropriately.

Remember firstly that there are different kinds of non-stationarity and different ways of dealing with them. Four common ones are:

1) Deterministic trends or trend stationarity. If your series is of this kind, de-trend it or include a time trend in the regression/model. You might want to check out the Frisch–Waugh–Lovell theorem on this one.

2) Level shifts and structural breaks. If this is the case, you should include a dummy variable for each break or, if your sample is long enough, model each regime separately.

3) Changing variance. Either model the subsamples separately or model the changing variance using the ARCH or GARCH class of models.

4) A unit root. In general you should then check for cointegrating relationships between the variables, but if you are only concerned with univariate forecasting you should difference the series once or twice, depending on the order of integration.

The steps to model the series:

1) Look at the ACF and PACF together with a time series plot to get an indication of whether the series is stationary or non-stationary. If the ACF decays very slowly and the time series plot looks like it exhibits a unit root (it is not mean reverting), then this is a good indication that the series does contain a unit root.
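As a rough illustration of what "decays very slowly" looks like, here is a minimal sketch using only NumPy and simulated data (the random walk and the helper name `sample_acf` are my own assumptions for the example, not your GDP or oil series). It compares the sample ACF of a random walk with the ACF of its first differences.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelations rho_1 .. rho_max_lag of a 1-D series."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[k:], x[:-k]) / denom
                     for k in range(1, max_lag + 1)])

# Simulated data (illustrative only): a pure random walk has a unit root.
rng = np.random.default_rng(0)
random_walk = np.cumsum(rng.standard_normal(500))

acf_level = sample_acf(random_walk, 10)           # stays close to 1
acf_diff = sample_acf(np.diff(random_walk), 10)   # close to 0 at every lag

print("ACF of levels:     ", np.round(acf_level, 2))
print("ACF of differences:", np.round(acf_diff, 2))
```

With real data you would plot these next to the series itself; the point is only that a slowly decaying ACF in levels combined with a flat ACF in differences is the typical unit-root pattern.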

2) Test the series for a unit root. This can be done with a wide range of tests, some of the most common being the ADF test, the Phillips-Perron (PP) test, the KPSS test (which has the null of stationarity) and the DF-GLS test, which is the most efficient of these. Note that if your series contain a structural break, the tests with a unit-root null are biased towards not rejecting that null. If you want to check the robustness of these tests and you suspect one or more structural breaks, you should use endogenous structural break tests. Two common ones are the Zivot-Andrews test, which allows for one endogenous structural break, and the Clemente-Montañés-Reyes test, which allows for two structural breaks. The latter comes in two versions: an additive outlier model, which captures sudden changes in the mean of the series, and an innovative outlier model, which captures gradual changes. Look these tests up on Wikipedia or in an econometrics textbook. Some statistical packages have these tests built in, which makes conducting a battery of unit root tests on your series very easy.

If your series contain a unit root, then test the first differences of your series in order to see whether they contain a second unit root.
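To make the mechanics concrete, here is a hedged sketch of the simplest Dickey-Fuller regression (constant, no augmentation lags) written with NumPy on simulated data. The function name `df_tstat` is made up for this example, and -2.86 is the usual asymptotic 5% critical value for the constant-only case. In practice you would use a packaged test (for example `adfuller` in statsmodels) that also chooses augmentation lags and reports proper p-values.

```python
import numpy as np

DF_CRIT_5PCT = -2.86  # asymptotic 5% critical value, regression with constant

def df_tstat(y):
    """t-statistic on rho in  dy_t = alpha + rho * y_{t-1} + e_t."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])
    coef, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ coef
    sigma2 = resid @ resid / (len(dy) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef[1] / np.sqrt(cov[1, 1])

# Simulated I(1) series (illustrative only).
rng = np.random.default_rng(0)
level = np.cumsum(rng.standard_normal(500))

stat_level = df_tstat(level)          # test the levels for a unit root
stat_diff = df_tstat(np.diff(level))  # test the differences for a second one

for name, stat in [("levels", stat_level), ("first differences", stat_diff)]:
    verdict = "reject unit root" if stat < DF_CRIT_5PCT else "cannot reject unit root"
    print(f"{name}: t = {stat:.2f} -> {verdict}")
```

The statistic is compared with Dickey-Fuller critical values, not with the usual t-distribution, because under the null the regressor is non-stationary.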

3) In case your series are non-stationary then you should:

        A) Apply the Engle-Granger 2-step procedure

        B) Apply an ADL model

        C) Apply an ECM model

Note that you could use the Johansen cointegration test or some other tests, but for simplicity these are left out; in your case, where you only have two time series, any one of A), B) and C) will suffice. Note that although the Engle-Granger procedure is easier to apply (at least I think so), the ADL/ECM estimators are preferable, as can be seen by conducting a Monte Carlo simulation.
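A minimal sketch of approach A) on simulated data, assuming a true long-run relation y = 1 + 2x plus a stationary AR(1) error (all numbers and names here are invented for illustration). Step 1 estimates the long-run relation by OLS and tests the residuals for a unit root; step 2 puts the lagged residual into an error correction equation. The residual-based test must be compared with Engle-Granger critical values (roughly -3.34 at 5% for two variables with a constant), not the ordinary Dickey-Fuller ones.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients, their standard errors, and residuals."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return coef, se, resid

# Simulated cointegrated pair (illustrative only).
rng = np.random.default_rng(1)
n = 500
x = np.cumsum(rng.standard_normal(n))   # I(1) regressor
v = rng.standard_normal(n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.5 * u[t - 1] + v[t]        # stationary equilibrium error
y = 1.0 + 2.0 * x + u

# Step 1: static long-run regression  y_t = b0 + b1 * x_t + e_t
(b0, b1), _, ehat = ols(np.column_stack([np.ones(n), x]), y)

# Residual-based unit root test:  d(ehat_t) = rho * ehat_{t-1} + error
rho, rho_se, _ = ols(ehat[:-1, None], np.diff(ehat))
eg_stat = rho[0] / rho_se[0]

# Step 2: error correction model  dy_t = c + alpha * ehat_{t-1} + gamma * dx_t
X_ecm = np.column_stack([np.ones(n - 1), ehat[:-1], np.diff(x)])
ecm_coef, ecm_se, _ = ols(X_ecm, np.diff(y))
alpha = ecm_coef[1]

print(f"long-run slope b1 = {b1:.3f}")
print(f"Engle-Granger statistic = {eg_stat:.2f}")
print(f"adjustment coefficient alpha = {alpha:.3f}")
```

A negative and significant adjustment coefficient means that deviations from the long-run relation are corrected over time, which is what cointegration implies.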

I will not explain all these approaches and how to derive the long-run solution, as that would take a considerable amount of time and space, but here is an excellent link that introduces these methods:

http://www.econ.ku.dk/metrics/Econometrics2_07_I/LectureNotes/Cointegration.pdf
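For orientation only, here is the simplest case, an ADL(1,1) model, rewritten as an ECM. The notation is my own and may differ from the lecture notes.

```latex
\begin{align*}
y_t &= \alpha + \phi\, y_{t-1} + \beta_0 x_t + \beta_1 x_{t-1} + \varepsilon_t \\
\Delta y_t &= \alpha + \beta_0\, \Delta x_t - (1-\phi)\, y_{t-1}
              + (\beta_0 + \beta_1)\, x_{t-1} + \varepsilon_t \\
           &= \beta_0\, \Delta x_t
              - (1-\phi)\left[\, y_{t-1} - \frac{\alpha}{1-\phi}
              - \frac{\beta_0 + \beta_1}{1-\phi}\, x_{t-1} \right] + \varepsilon_t
\end{align*}
```

The second line follows from subtracting $y_{t-1}$ on both sides and adding and subtracting $\beta_0 x_{t-1}$. The term in square brackets is the deviation from the long-run solution $y^* = \frac{\alpha}{1-\phi} + \frac{\beta_0+\beta_1}{1-\phi}\, x^*$, and $-(1-\phi)$ is the speed of adjustment, which requires $|\phi| < 1$.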

4) The number of lags you include in your ADL model should be chosen so that all residual autocorrelation is eliminated.
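One way to do this, sketched below with NumPy on simulated data, is to increase the lag length until a Ljung-Box Q test on the residuals no longer rejects. The helper names and the data-generating process are assumptions made for the example. I compare Q (10 lags) with 18.307, the 5% chi-square critical value for 10 degrees of freedom, which ignores the degrees-of-freedom correction some texts apply to residuals from a fitted model.

```python
import numpy as np

CHI2_95_DF10 = 18.307  # 5% critical value, chi-square with 10 d.o.f.

def ljung_box_q(resid, m):
    """Ljung-Box Q statistic using the first m residual autocorrelations."""
    r = resid - resid.mean()
    n = len(r)
    denom = r @ r
    total = 0.0
    for k in range(1, m + 1):
        rho_k = (r[k:] @ r[:-k]) / denom
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total

def adl_residuals(y, x, p):
    """OLS residuals of y_t on a constant, y_{t-1..t-p} and x_{t..t-p}."""
    n = len(y)
    cols = [np.ones(n - p)]
    cols += [y[p - j:n - j] for j in range(1, p + 1)]   # lags of y
    cols += [x[p - j:n - j] for j in range(0, p + 1)]   # x_t and its lags
    X = np.column_stack(cols)
    target = y[p:]
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return target - X @ coef

# Simulated data (illustrative only): y depends on two of its own lags.
rng = np.random.default_rng(2)
n = 500
x = rng.standard_normal(n)
e = rng.standard_normal(n)
y = np.zeros(n)
for t in range(2, n):
    y[t] = 0.2 * y[t - 1] + 0.5 * y[t - 2] + 0.5 * x[t - 1] + e[t]

q_by_lag = {p: ljung_box_q(adl_residuals(y, x, p), 10) for p in range(1, 5)}
chosen = next((p for p, q in q_by_lag.items() if q < CHI2_95_DF10), None)

for p, q in q_by_lag.items():
    print(f"ADL({p},{p}): Q(10) = {q:.1f}")
print("smallest lag length without residual autocorrelation:", chosen)
```

Here the one-lag model leaves clear autocorrelation in the residuals because the second lag of y is omitted, so the search has to move on to longer lag lengths.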

5) After your cointegration analysis you are more or less done. Please note that in case you want to expand your model to several variables you should use the CVAR model and the analysis gets a lot more complicated as mentioned above.

6) In case your variables do not cointegrate but contain a unit root then continue with your ARIMAX modelling

        A) Difference the series

        B) Choose the lag length according to the ACF and PACF. Pick the best model according to the AIC, BIC or HQ criteria and test for residual autocorrelation using the Ljung-Box Q test. Test the significance of your variables.

        C) Fit an ADL/ECM model to your data. Include enough lags to remove serial correlation and test the significance of the variables.
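The lag choice in B) can be sketched as follows, again with NumPy and simulated data (the series, the function names and the maximum lag of 6 are assumptions for illustration). The series is differenced first, and every candidate model is fitted on the same sample so that the criteria are comparable.

```python
import numpy as np

def ar_design(dz, p, max_p):
    """Target and regressors for AR(p), dropping max_p observations for every p."""
    n = len(dz)
    target = dz[max_p:]
    cols = [np.ones(n - max_p)]
    cols += [dz[max_p - j:n - j] for j in range(1, p + 1)]
    return target, np.column_stack(cols)

def info_criteria(target, X):
    """Gaussian AIC, BIC and HQ computed from the OLS residual sum of squares."""
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    rss = np.sum((target - X @ coef) ** 2)
    n, k = X.shape
    base = n * np.log(rss / n)
    return base + 2 * k, base + k * np.log(n), base + 2 * k * np.log(np.log(n))

# Simulated I(1) series whose differences follow an AR(2) (illustrative only).
rng = np.random.default_rng(3)
n = 600
e = rng.standard_normal(n)
d = np.zeros(n)
for t in range(2, n):
    d[t] = 0.2 * d[t - 1] + 0.5 * d[t - 2] + e[t]
z = np.cumsum(d)

dz = np.diff(z)  # step A): difference the series
MAX_P = 6
scores = {p: info_criteria(*ar_design(dz, p, MAX_P)) for p in range(1, MAX_P + 1)}

best_aic = min(scores, key=lambda p: scores[p][0])
best_bic = min(scores, key=lambda p: scores[p][1])
best_hq = min(scores, key=lambda p: scores[p][2])
print("lags chosen by AIC, BIC, HQ:", best_aic, best_bic, best_hq)
```

BIC penalises extra parameters more heavily than AIC, so it never picks a longer lag length than AIC on the same sample. Whichever criterion you use, still check the residuals of the chosen model for autocorrelation.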

7) In case of stationary variables, estimate a stationary ADL/ECM model for your data or proceed with your ARIMAX model, following the same steps as in 6). An excellent introductory note on the stationary models can be found here: http://www.econ.ku.dk/metrics/Econometrics2_07_I/LectureNotes/dynamicmodels.pdf

In case your series contain a unit root with drift, or no unit root but a deterministic trend, you can add a time trend to your specification. Further, check the first differences of the series and the time series plots to see whether your series contain a structural break and/or outliers, and include dummy variables for these. Note that you should test for structural breaks (see point 2) above); another alternative is the Chow test. Thirdly, it could be an idea to take natural logs of your variables, as this tends to stabilize the variance of the series. Because the log is a monotonic transformation, it preserves the ordering of the observations.
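A small side note on logs, with made-up numbers: the first difference of the log of a series is approximately its growth rate, which is convenient if you want to model GDP growth directly.

```python
import numpy as np

# Hypothetical level series (illustrative numbers only).
levels = np.array([100.0, 102.0, 105.06, 103.0])

log_growth = np.diff(np.log(levels))          # log differences
pct_growth = levels[1:] / levels[:-1] - 1.0   # exact period-on-period growth

print(np.round(log_growth, 4))
print(np.round(pct_growth, 4))
```

The two are close for small changes and drift apart as the changes get larger, so the approximation is best suited to series that move by a few percent per period.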

Hopefully this made some sense. Please note that this was a very short introduction and that this material could easily fill several chapters in a textbook. I strongly recommend that you read the two lecture notes I linked to, or that you get hold of a textbook on time series analysis/econometrics. If you need help understanding some of the concepts better, please feel free to ask! Model specifications and examples are all included in the lecture notes I linked to.
