Solved – auto.arima warns NaNs produced on std error

arima, r, regression

My data is a time series of employed population, L, and the time span, year.

n.auto=auto.arima(log(L),xreg=year)
summary(n.auto)
Series: log(L) 
ARIMA(2,0,2) with non-zero mean 

Coefficients:
         ar1      ar2      ma1     ma2  intercept    year
      1.9122  -0.9567  -0.3082  0.0254    -3.5904  0.0074
s.e.     NaN      NaN      NaN     NaN     1.6058  0.0008

sigma^2 estimated as 1.503e-06:  log likelihood=107.55
AIC=-201.1   AICc=-192.49   BIC=-193.79

In-sample error measures:
           ME          RMSE           MAE           MPE          MAPE 
-7.285102e-06  1.225907e-03  9.234378e-04 -6.836173e-05  8.277295e-03 
         MASE 
 1.142899e-01 
Warning message:
In sqrt(diag(x$var.coef)) : NaNs produced

Why does this happen? Why would auto.arima select as "best" a model in which the standard errors of the ar* and ma* coefficients are NaN (Not a Number)? Is the selected model valid at all?

My goal is to estimate the parameter n in the model L=L_0*exp(n*year). Any suggestions for a better approach?

TIA.

data:

L <- structure(c(64749, 65491, 66152, 66808, 67455, 68065, 68950, 
69820, 70637, 71394, 72085, 72797, 73280, 73736, 74264, 74647, 
74978, 75321, 75564, 75828, 76105), .Tsp = c(1990, 2010, 1), class = "ts")
year <- structure(1990:2010, .Tsp = c(1990, 2010, 1), class = "ts")
L
Time Series:
Start = 1990 
End = 2010 
Frequency = 1 
 [1] 64749 65491 66152 66808 67455 68065 68950 69820 70637 71394 72085 72797
[13] 73280 73736 74264 74647 74978 75321 75564 75828 76105

Best Answer

The sum of the AR coefficients is close to 1, which shows that the parameters are near the edge of the stationarity region. That causes difficulties when computing the standard errors. However, there is nothing wrong with the estimates themselves, so if all you need is a point estimate such as the value of $L_0$, you've got it.
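One way to see how close the fitted AR part sits to the stationarity boundary is to look at the roots of the AR polynomial, using the ar1 and ar2 estimates printed in the question. This is a minimal sketch in base R. The small covariance matrix at the end is made up (it is not taken from the fit) and only shows how a negative variance estimate turns into NaN under sqrt(diag(...)), which is the expression named in the warning.

```r
# AR estimates copied from the ARIMA(2,0,2) output in the question
ar <- c(1.9122, -0.9567)

# Sum of the AR coefficients: close to 1
ar_sum <- sum(ar)

# Roots of the AR polynomial 1 - ar1*z - ar2*z^2.
# Stationarity requires every root to lie outside the unit circle (modulus > 1).
ar_roots <- polyroot(c(1, -ar))
root_moduli <- Mod(ar_roots)

print(ar_sum)
print(root_moduli)

# Hypothetical covariance matrix with one negative diagonal entry,
# used only to illustrate where the NaN comes from
vc <- diag(c(-1e-4, 2.5e-6))
se <- suppressWarnings(sqrt(diag(vc)))
print(se)
```

Root moduli only slightly above 1 indicate a nearly non-stationary model, which is the kind of situation in which a numerically estimated covariance matrix can end up with negative diagonal entries.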

auto.arima() takes a few shortcuts to speed up the computation, and when it returns a model that looks suspect, it is a good idea to turn those shortcuts off and see what you get (approx in the call is matched by R to the approximation argument). In this case:

> n.auto <- auto.arima(log(L),xreg=year,stepwise=FALSE,approx=FALSE)
> 
> n.auto
Series: log(L) 
ARIMA(2,0,0) with non-zero mean 

Coefficients:
         ar1      ar2  intercept    year
      1.8544  -0.9061    11.0776  0.0081
s.e.  0.0721   0.0714     0.0102  0.0008

sigma^2 estimated as 1.594e-06:  log likelihood=107.19
AIC=-204.38   AICc=-200.38   BIC=-199.15

This model is a little better: it has a smaller AIC (-204.38 versus -201.1), and all the standard errors are now finite.
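Taking logs of L=L_0*exp(n*year) gives a regression of log(L) on year, so the coefficient on year in the output is the estimate of n. The sketch below, in base R, turns the reported estimate and standard error into an approximate 95% interval (assuming the estimator is roughly normal) and cross-checks the point estimate against an ordinary least squares fit on the posted data. The OLS fit ignores the autocorrelation in the errors, so only its slope is used here, not its standard error.

```r
# Data as posted in the question
L <- ts(c(64749, 65491, 66152, 66808, 67455, 68065, 68950,
          69820, 70637, 71394, 72085, 72797, 73280, 73736, 74264, 74647,
          74978, 75321, 75564, 75828, 76105), start = 1990)
year <- 1990:2010

# Estimate and standard error of the year coefficient,
# copied from the ARIMA(2,0,0) output
n_hat <- 0.0081
n_se  <- 0.0008

# Approximate 95% confidence interval (normal approximation)
n_ci <- n_hat + c(-1, 1) * qnorm(0.975) * n_se

# Implied average annual growth rate of L
growth_rate <- exp(n_hat) - 1

# Cross-check of the point estimate with a plain linear trend in log(L)
ols <- lm(as.numeric(log(L)) ~ year)
n_ols <- unname(coef(ols)["year"])

print(n_ci)
print(growth_rate)
print(n_ols)
```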

Related Solutions

Solved – time series forecasting using auto.arima and exponential smoothing

1. Seasonality is probably not very strong. Different algorithms will give different results, unless seasonality is glaringly obvious.
2. The best measure is always to compare forecast accuracy on a holdout set: hold back the last $n$ observations, fit your models to all other observations, forecast into the last $n$ time periods with both models, then compare forecast accuracy using your error measure of choice (see 5 below).
3. Yes, this is a common complaint. I don't think there is an easy way to get the in-sample fit. But you can get the residuals: auto.arima(WWWusage)$residuals. Best to look into the code of auto.arima() to see whether you need to add or subtract them from the original series to get the fit. I'd say you have to subtract ("actuals=model+residuals"), but better check.
4. I recommend a good forecasting textbook. This is a very good start. Otherwise, read through the help pages.
5. The appropriate error measure will depend on your personal loss function. Is your pain symmetric, and will it increase more strongly with larger errors? Then use MSE. Is your pain proportional to absolute errors? Then use MAE. Best to look at multiple error measures.
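As a sketch of the "actuals=model+residuals" relationship, the in-sample fit can be recovered by subtracting the residuals from the observed series. The example uses base R's arima() with an arbitrarily chosen order on the built-in WWWusage data, so it runs without extra packages. With the forecast package, calling fitted() on the object returned by auto.arima() should give the same kind of series directly.

```r
# Arbitrary illustrative order; auto.arima() might select a different one
fit <- arima(WWWusage, order = c(1, 1, 1))

# Residuals are on the same time index as the original series
res <- residuals(fit)

# actuals = fit + residuals, hence fit = actuals - residuals
in_sample_fit <- WWWusage - res

print(head(cbind(actual = WWWusage, fitted = in_sample_fit, residual = res)))
```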

One tip: averaging forecasts will usually improve accuracy. Consider taking the average of your two models' forecasts per future time bucket.
auto.arima() apparently fits no drift, even if you allow it.
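The holdout comparison, the choice of error measure, and the forecast-averaging tip can be sketched together as follows. This is only an illustration under assumptions: it uses the built-in WWWusage series, holds back the last 10 observations, and fits two arbitrarily chosen base R models (an ARIMA(1,1,1) via arima() and non-seasonal Holt-Winters smoothing) rather than auto.arima() and ets() from the forecast package. The helper error_measures() is hypothetical.

```r
h <- 10                                  # size of the holdout set
train <- window(WWWusage, end = 90)      # observations used for fitting
test  <- window(WWWusage, start = 91)    # observations held back

# Model 1: ARIMA with an arbitrary illustrative order
arima_fit <- arima(train, order = c(1, 1, 1))
arima_fc  <- as.numeric(predict(arima_fit, n.ahead = h)$pred)

# Model 2: exponential smoothing with trend, no seasonal component
hw_fit <- HoltWinters(train, gamma = FALSE)
hw_fc  <- as.numeric(predict(hw_fit, n.ahead = h))

# Simple average of the two forecasts per future period
avg_fc <- (arima_fc + hw_fc) / 2

# Hypothetical helper: MAE and RMSE of a forecast against the holdout
error_measures <- function(actual, forecast) {
  e <- as.numeric(actual) - forecast
  c(MAE = mean(abs(e)), RMSE = sqrt(mean(e^2)))
}

results <- rbind(
  ARIMA       = error_measures(test, arima_fc),
  HoltWinters = error_measures(test, hw_fc),
  Average     = error_measures(test, avg_fc)
)
print(results)
```

Which row comes out best depends on the series and on the holdout window, so it is worth repeating the comparison with a few different holdout lengths before drawing conclusions.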

Solved – Error in optim – Auto.arima R

The following works in the sense that it returns a valid model:

library(forecast)
sku1 <- read.csv("Sample.csv")
actual_val = ts(sku1$Sal , frequency = 52)
dummy_val = seasonaldummy(actual_val) 
model_value <- auto.arima(actual_val , xreg = dummy_val )

Perhaps your problem is that you haven't defined xreg_val.

However, it isn't a sensible model as you use 51 degrees of freedom for seasonality but have only two years of daily data. I suggest you use Fourier terms instead:

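# Fourier terms: 2*K = 20 regressors instead of 51 seasonal dummies
seas <- fourier(actual_val, K = 10)
model_value <- auto.arima(actual_val, xreg = seas, lambda = 0)
# fourierf() is deprecated in recent versions of forecast; use fourier(..., h = ) for future values
plot(forecast(model_value, xreg = fourier(actual_val, K = 10, h = 52), lambda = 0))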
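To make the degrees-of-freedom argument concrete, the sketch below builds sine and cosine regressors by hand in base R. The helper fourier_terms() is hypothetical and only illustrates the idea; it is not the implementation behind forecast's fourier(), whose output may differ in column naming and other details. The series length of 104 (two years at frequency 52) is also an assumption. With K = 10 the seasonal pattern is described by 20 columns rather than 51 dummy variables.

```r
# Hypothetical helper: K pairs of sine/cosine terms for a given seasonal period
fourier_terms <- function(n, period, K) {
  t <- seq_len(n)
  cols <- lapply(seq_len(K), function(k) {
    cbind(sin(2 * pi * k * t / period), cos(2 * pi * k * t / period))
  })
  X <- do.call(cbind, cols)
  # Column order is S1, C1, S2, C2, ..., matching the cbind() order above
  colnames(X) <- paste0(rep(c("S", "C"), times = K), rep(seq_len(K), each = 2))
  X
}

# Assumed: two years of observations at frequency 52
seas_manual <- fourier_terms(n = 104, period = 52, K = 10)
print(dim(seas_manual))
```

A matrix like this can be passed as xreg in the same way as the seasonal dummies, at the cost of assuming the seasonal pattern is smooth.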
