Solved – Review of Box-Jenkins methodology

I just finished developing an ARMAX model in Python (mostly statsmodels) in order to forecast some data. My next step is to test the data (24 time series) with this ARMAX model. Since I need to write proper academic documentation of all the tests I use and of the way I test my data, I need a proper testing design.

I found some good designs here: http://www.autobox.com/cms/index.php/blog/entry/build-or-make-your-own-arima-forecasting-model

However, my model and testing design looks like this:

Data preparation (identification, and differencing the data to obtain a stationary series)
- Descriptive statistics for each hour (count, mean, skewness etc)
- Augmented Dickey Fuller Test to detect stationarity of given time series

–> excel-documentation: Stationarity of time series exists!

Model Selection (examine data, ACF and PACF to identify potential models, choosing tentative p and q)
- Plot and analyse ACF and PACF
- Automatic selection of lowest Information criterion (AIC, BIC, HQIC)

–> excel-documentation: ACF and PACF plot/picture, interpretation of plot, Lowest information criterion (AIC, BIC, HQIC)

Estimation (estimate parameters in potential models and test them; select the best model using a suitable criterion)
- choose p- and q-parameter according to lowest AIC

–> excel-documentation: which parameters are going to be used for arma.prediction

Diagnostic (falsification of model selection process)
- Durbin-Watson Test to detect presence of autocorrelation
- plot residuals to see structure i.e. white noise
- Normality test (D'Agostino and Pearson) to see difference from normal distribution
- qqplot of the residuals against quantiles of t-distribution (in addition to normality test)
- plot ACF and PACF of residuals to detect white noise
- Ljung-Box Test to test overall randomness based on a number of lags

–> excel-documentation: Durbin-Watson-Test-Results, Normality-Test-Results, Summary of Ljung-Box-Test (Q>0, y/n?)

Forecasting (use model to forecast)
- run model
- analyse arma.summary-table
- compare predicted value with real value (in-sample analysis)

–> excel-documentation: prediction value for given p- and q-values (see '3. Estimation')

Verification (Mean absolute percentage error (MAPE) for in-sample analysis)
- compare predicted value with real value
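For the verification step, the MAPE can be computed directly with NumPy. The sketch below uses the usual definition, the mean of |actual - predicted| / |actual| times 100; the function name `mape` and the example numbers are made up for illustration.

```python
import numpy as np


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    if actual.shape != predicted.shape:
        raise ValueError("actual and predicted must have the same shape")
    if np.any(actual == 0):
        # Division by zero: MAPE is undefined for zero actual values.
        raise ValueError("MAPE is undefined when an actual value is zero")
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100.0)


# Made-up numbers: errors of 10%, 10% and 0% average to about 6.67%.
example_mape = mape([100.0, 200.0, 400.0], [110.0, 180.0, 400.0])
print(round(example_mape, 2))
```

Because the error is scaled by the actual value, hours with values near zero can dominate the average, which is worth noting in the documentation if any of the 24 series comes close to zero.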

–> excel-documentation: MAPE for given p- and q-values

Go back to '3. Estimation' and run again if the diagnostic results and MAPE are not satisfactory.
Maximum Re-Running-Time based on optimal selection of information criterion: if model output is not satisfactory, choose higher and lower p- and q-values. Use lowest BIC and/or HQIC (if AIC, BIC and HQIC suggest same p- and q values, use different approach)

It would be great if someone could take a minute and tell me whether this sounds legitimate from an academic point of view.

Thanks in advance

Best Answer

"Data preparation (identification and Difference data to obtain stationary series)". Non-stationarity may be the symptom, while the cause may be a simple change in the mean, a simple change in trend, a simple change in parameters, or a simple change in error variance. Conversely, an unusual value (pulse) will increase the variance and the covariance, with the net effect that the ACF is biased downwards, possibly leading to the false conclusion that no ARIMA structure exists. Either way, your design does not follow the flow charts presented in your reference.
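The effect of a pulse on the ACF can be illustrated with a small simulation. This is only a sketch under assumed values: an AR(1) process with coefficient 0.8, 300 observations, and a single pulse of size 30 added at one time point; the helper `lag1_acf` is written out by hand for the example.

```python
import numpy as np


def lag1_acf(x):
    """Sample autocorrelation at lag 1."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return float(np.sum(d[1:] * d[:-1]) / np.sum(d * d))


# Simulated AR(1) series with a clear autoregressive structure (assumption).
rng = np.random.default_rng(42)
n, phi = 300, 0.8
clean = np.zeros(n)
for t in range(1, n):
    clean[t] = phi * clean[t - 1] + rng.normal()

# Same series with one large unusual value (pulse) added.
contaminated = clean.copy()
contaminated[150] += 30.0

acf_clean = lag1_acf(clean)
acf_contaminated = lag1_acf(contaminated)
print(f"lag-1 ACF without pulse: {acf_clean:.2f}")
print(f"lag-1 ACF with pulse:    {acf_contaminated:.2f}")
```

The pulse enters the denominator (the variance) squared but the numerator only through products with its neighbours, so the lag-1 autocorrelation shrinks and the AR structure looks weaker than it is.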
