Solved – Using AIC to determine best ARIMA Model

arima, time series

I'm trying to fit an ARIMA model to a housing data set.
By playing around with p and q, I was able to get an ARIMA(2,1,2)(2,0,0) model with AIC = 4946.76.
I then used auto.arima to check whether I had picked the best model. auto.arima picked the (2,1,3)(2,0,0) model, which had AIC = 4948.21.

Then I looked at the output for both models to see what the difference between the two was.
The ARIMA(2,1,2)(2,0,0) model produced a warning:

Warning message:
In sqrt(diag(x$var.coef)) : NaNs produced

My question is why did auto.arima pick the (2,1,3)(2,0,0) model instead of (2,1,2)(2,0,0)?

Best Answer

auto.arima takes some shortcuts to speed things up, such as approximating the likelihood during the search and searching stepwise instead of over every candidate model. You can try auto.arima(data, approximation=FALSE, stepwise=FALSE) to turn these shortcuts off and deal with the warning, which is likely caused by coefficients being close to the edge of the stationarity region. Be aware that this can take much longer than the default search, so you could try just approximation=FALSE first.
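The calls described above might look roughly like the following. This is only a sketch: the housing data from the question is not available, so the built-in `AirPassengers` series is used as a stand-in, and the object names `y`, `fit_exact` and `fit_full` are made up for illustration. The full argument name in the forecast package is `approximation`; shorter forms such as `approx` are resolved by R's partial argument matching.

```r
library(forecast)

# Stand-in series (assumption): the question's housing data is not available,
# so the built-in monthly AirPassengers series is used instead.
y <- log(AirPassengers)

# First attempt: disable only the likelihood approximation.
# The stepwise search stays on, so this is still fairly fast.
fit_exact <- auto.arima(y, approximation = FALSE)

# Slower, more thorough: also replace the stepwise search with a search
# over all allowed orders. Uncomment if the run time is acceptable.
# fit_full <- auto.arima(y, approximation = FALSE, stepwise = FALSE)

# Show the selected orders and the fitted coefficients.
print(fit_exact)
```

Starting with `approximation = FALSE` alone keeps the run time modest; the exhaustive search is worth the wait mainly when the stepwise result looks suspicious.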

You can use auto.arima(...)\$aic to get the actual value of the AIC; maybe it is very slightly smaller for $q=3$. Note also that auto.arima ranks models by AICc rather than AIC unless you set ic="aic". As the two values are almost exactly the same (they differ by about 1.45), the value of $q$ probably doesn't matter much. If "playing around" led you to $q=2$, then $q=2$ is fine. Time series modelling is not an exact science, and a small amount of subjectivity is involved. As long as you justify why you chose $q=2$ and carry out the appropriate model diagnostics (for example, looking at the residuals), there is no need to worry.
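A comparison along these lines could be sketched as follows. The series is again the built-in `AirPassengers` stand-in, not the housing data, and the seasonal order (0,1,1) was picked to suit that stand-in, so only the contrast between $q=2$ and $q=3$ mirrors the question. The names `fit_q2`, `fit_q3` and `lb` are hypothetical, and the Ljung-Box lag of 24 is an arbitrary illustrative choice.

```r
library(forecast)

# Stand-in series (assumption), not the housing data from the question.
y <- log(AirPassengers)

# Two candidates that differ only in the MA order q. The seasonal part
# (0,1,1) is chosen for this stand-in series, not taken from the question.
# method = "ML" fits by full maximum likelihood from the start.
fit_q2 <- Arima(y, order = c(2, 1, 2), seasonal = c(0, 1, 1), method = "ML")
fit_q3 <- Arima(y, order = c(2, 1, 3), seasonal = c(0, 1, 1), method = "ML")

# Information criteria side by side; lower is better.
print(rbind(
  AIC  = c(q2 = fit_q2$aic,  q3 = fit_q3$aic),
  AICc = c(q2 = fit_q2$aicc, q3 = fit_q3$aicc)
))

# Residual diagnostic: Ljung-Box test for leftover autocorrelation.
# fitdf is the number of estimated ARMA coefficients.
lb <- Box.test(residuals(fit_q2), lag = 24, type = "Ljung-Box",
               fitdf = length(coef(fit_q2)))
print(lb)
```

A large p-value from the Ljung-Box test is consistent with residuals that behave like white noise. The forecast package's `checkresiduals()` runs a similar test and also plots the residuals and their ACF.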
