`auto.arima` uses some approximations in order to speed up the processing. The final model is fitted using full MLE, but along the way candidate models are estimated using conditional sums of squares (CSS) unless you use the argument `approximation=FALSE`. This is explained in the help file:
> approximation: If TRUE, estimation is via conditional sums of squares and the information criteria used for model selection are approximated. The final model is still computed using maximum likelihood estimation. Approximation should be used for long time series or a high seasonal period to avoid excessive computation times.
The default setting is `approximation = (length(x) > 100 | frequency(x) > 12)`; again, this is specified in the help file. As you have 17544 observations, the default setting gives `approximation=TRUE`.
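To see what the default evaluates to for your data, you can compute the condition directly. A minimal sketch, assuming an hourly series (frequency 24) with 17544 observations as in your data; the series itself is simulated since only its length and frequency matter here:

```r
# Reproducing the default: approximation = (length(x) > 100 | frequency(x) > 12)
x <- ts(rnorm(17544), frequency = 24)
approximation <- (length(x) > 100 | frequency(x) > 12)
approximation  # TRUE, so auto.arima() uses CSS approximations unless told otherwise
```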
Using the approximations, the best model found was a regression with ARIMA(5,1,0) errors and an AICc of 2989.33. With the approximations turned off, the best model has ARIMA(2,1,1) errors with an AICc of 2361.40.
```r
> fitauto = auto.arima(reprots[,"lnwocone"], approximation=FALSE,
    xreg=cbind(fourier(reprots[,"lnwocone"], K=11),
               reprots[,c("temp","sqt","humidity","windspeed","mist","rain")]),
    start.p=1, start.q=1, trace=TRUE, seasonal=FALSE)
> fitauto
Series: reprots[, "lnwocone"]
ARIMA(2,1,1) with drift
...
sigma^2 estimated as 0.08012:  log likelihood=-1147.63
AIC=2361.27   AICc=2361.4   BIC=2617.76
```
A note on terminology: commonly we fit a model to the data rather than fit the data to a model.
I can do step 1, but don't know how to relate that to step 2. Am I using the remainder from stl analysis for ARIMA modeling? If not, what's the point of step 1?
From STL you obtain three components: trend, seasonal and remainder. You could remove the seasonal component and use the sum of trend and remainder for further modelling with ARIMA.
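A minimal sketch of this workflow, using the built-in `co2` series purely for illustration:

```r
# Decompose with STL, then seasonally adjust by removing the seasonal component.
fit <- stl(co2, s.window = "periodic")
seasonal  <- fit$time.series[, "seasonal"]
trend     <- fit$time.series[, "trend"]
remainder <- fit$time.series[, "remainder"]
adjusted  <- co2 - seasonal   # identical to trend + remainder
# 'adjusted' is then the input for (non-seasonal) ARIMA modelling.
```

The `seasadj()` function in the forecast package performs the same subtraction for you.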
But I can't get past the diagnostics. My Ljung-Box values are ALWAYS significant for ALL lags. Okay, so that means my residuals are correlated (I think). And since I want to use the residuals for cross-correlation, I assume that's bad.
Yes, having significant autocorrelations at ALL lags is clearly a problem. I would generally agree with the comment by @Glen_b, but in a case where all lags are significant the problem seems hard to deny. Curiously, the ACF plot does not immediately suggest that the autocorrelations are a really big problem (only a few lags stick out of the confidence interval by much), and the latter only becomes evident from the Ljung-Box test. I would not stop there, and I would not accept a model with such a terrible Ljung-Box picture. Instead, I would look for other models.
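For reference, the Ljung-Box test is available in base R via `Box.test()`. A hedged sketch of how it is typically applied to model residuals, on a simulated AR(1) series rather than your data:

```r
set.seed(1)
y <- arima.sim(model = list(ar = 0.5), n = 500)  # simulated AR(1) series
fit <- arima(y, order = c(1, 0, 0))
# fitdf accounts for the number of estimated ARMA parameters
Box.test(residuals(fit), lag = 24, fitdf = 1, type = "Ljung-Box")
# Small p-values across many lags, as in your case, indicate remaining autocorrelation.
```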
One caveat: if you use STL and remove the seasonal component before estimating ARIMA models on trend+remainder, you should not allow for a seasonal component in the ARIMA model (which would make it a SARIMA model); use the option `seasonal=FALSE` in the function `auto.arima`. Perhaps making this change will help you find better models.
Note also that after taking the 24-hour difference, the ACF and PACF still show significant spikes at the 24-hour lag. This may indicate that taking the 24-hour difference was not such a good idea. Normally you would expect the lag at which you have differenced the data not to have a significant ACF or PACF value.
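A hypothetical sketch of this check (simulated hourly data, not your series): after a lag-24 difference, the spike at lag 24 should largely disappear, and a spike that remains significant suggests the seasonal difference may not have been needed.

```r
set.seed(42)
# Simulated hourly series: AR(1) noise plus a deterministic daily cycle
x <- ts(arima.sim(list(ar = 0.6), n = 2000) +
        2 * sin(2 * pi * seq_len(2000) / 24), frequency = 24)
d24 <- diff(x, lag = 24)   # 24-hour (daily) difference
acf(d24, lag.max = 72)     # inspect whether lag 24 (and multiples) still stand out
pacf(d24, lag.max = 72)
```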
Does this mean my time series doesn't fit an ARIMA model?
The model you showed us indeed does not seem to fit the data well as evidenced by the poor Ljung-Box statistics. If I were you, I would try some other model instead.
Best Answer
This is probably explained in the documentation. Looking at the source code, I found that `Inf` is reported when the likelihood of the model turns out to be infinite, or when the smallest root of one of the model's polynomials has modulus lower than 1.01. When the AR polynomial is close to being non-stationary, or the MA polynomial is close to being non-invertible, the model is rejected by setting an infinite value for the AIC of that model.
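The 1.01 threshold on root moduli can be illustrated directly with `polyroot()`. The MA coefficients below are hypothetical, chosen so that one root sits exactly on the unit circle:

```r
# MA polynomial 1 + theta1*z + theta2*z^2, here 1 - 1.98 z + 0.98 z^2,
# which factors with a root at z = 1 (modulus 1.0).
theta <- c(-1.98, 0.98)
ma_roots <- polyroot(c(1, theta))
min(Mod(ma_roots))   # modulus 1.0 < 1.01, so such a model would be rejected with AIC = Inf
```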
`Inf *` is reported when the ARIMA model couldn't be fitted and an error was returned by `stats::arima`.

For example, `auto.arima` can report an AIC equal to `Inf` for a model such as ARIMA(2,1,2): fitting that particular model shows that the MA polynomial is close to being non-invertible, which is why `auto.arima` sets a large value for the AIC in order to make sure that this model is not chosen.
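As a hedged stand-in for such an example (the series and the order here are illustrative, not from the question): over-differencing a stationary series drives the MA estimate toward the non-invertibility boundary, which is exactly the situation the 1.01 root check guards against.

```r
set.seed(123)
y <- ts(rnorm(300))                  # already stationary, so d = 1 over-differences it
fit <- arima(y, order = c(0, 1, 1))  # differenced white noise is MA(1) with theta = -1
coef(fit)["ma1"]                     # estimate close to -1, i.e. near non-invertible
min(Mod(polyroot(c(1, coef(fit)["ma1"]))))  # MA root modulus close to 1
```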