I have a problem with the forecast function for ARIMA models in R. It calls predict, which in turn calls KalmanForecast. OK, here's the deal.
The mean one-step forecast of the Arima object, produced by this call
forecast(Arima, h=1)$mean[[1]]
is often significantly different from the result of a manual forecast by conditional expected value (best linear predictor).
For example, a non-seasonal ARIMA(1,1,1) without drift has the structure
y[t] = y[t-1] + AR1*(y[t-1] - y[t-2]) + MA1*epsilon[t-1] + epsilon[t]
so the one-step prediction is straightforward (epsilon[t] has zero conditional mean, so it drops out):
yhat[t] = y[t-1] + AR1*(y[t-1] - y[t-2]) + MA1*epsilon[t-1]
but this result is always different from the result of the forecast function call.
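Spelled out as a conditional expectation, the step from the model to the forecast looks like the following. The symbols \phi_1 and \theta_1 are shorthand introduced here for AR1 and MA1, and the derivation assumes the model is invertible, so that \varepsilon_{t-1} can be recovered from past observations.

```latex
\begin{align*}
\hat{y}_{t \mid t-1}
  &= \mathrm{E}\left[ y_t \mid y_{t-1}, y_{t-2}, \dots \right] \\
  &= y_{t-1} + \phi_1 (y_{t-1} - y_{t-2}) + \theta_1 \varepsilon_{t-1}
     + \mathrm{E}\left[ \varepsilon_t \mid y_{t-1}, y_{t-2}, \dots \right] \\
  &= y_{t-1} + \phi_1 (y_{t-1} - y_{t-2}) + \theta_1 \varepsilon_{t-1}
\end{align*}
```

The last line treats \varepsilon_{t-1} as known. In practice it is not observed and has to be replaced by an estimate, such as the last residual of the fitted model, so the manual forecast can only be as accurate as that residual.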
Is it due to approximation errors in the Kalman recursion?
Try it yourself with this code; it only needs the forecast package:
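library(forecast)
# Simulate an ARMA(2,2) series
x = arima.sim(n = 1000, list(ar = c(0.8897, -0.4858), ma = c(-0.2279, 0.2488)), sd = sqrt(0.1796))
t = length(x) + 1
# Fit a non-seasonal ARIMA(1,1,1) without mean or drift
Arimafit = Arima(x, order = c(1,1,1), seasonal = list(order = c(0,0,0), period = 1), include.mean = FALSE, include.drift = FALSE)
# Manual one-step forecast from the fitted coefficients and the last residual
manualforecast = x[t-1] + coef(Arimafit)[["ar1"]]*(x[t-1] - x[t-2]) + coef(Arimafit)[["ma1"]]*Arimafit$residuals[t-1]
# One-step forecast from the forecast package
autoforecast = forecast(Arimafit, h = 1)$mean[[1]]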
autoforecast is always different from manualforecast, sometimes significantly.
Best Answer
In this example, the model is mis-specified, and so the fitted coefficients might be close to the boundary of the stationarity or invertibility regions. This will cause instability in the computations.
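One way to see how close a set of coefficients is to those boundaries is to look at the moduli of the roots of the AR and MA polynomials: the model is stationary and invertible when all of them lie outside the unit circle, and a modulus close to 1 signals trouble. Below is a minimal sketch in base R. The helper name root_moduli and the example coefficients are hypothetical, chosen only for illustration, and the sign convention assumed is the one used by R's arima (AR polynomial 1 - ar1*z - ..., MA polynomial 1 + ma1*z + ...).

```r
# Hypothetical helper (not part of any package): moduli of the roots of
# the AR polynomial 1 - ar[1]*z - ... and the MA polynomial 1 + ma[1]*z + ...
root_moduli <- function(ar = numeric(0), ma = numeric(0)) {
  list(
    ar = Mod(polyroot(c(1, -ar))),  # stationarity: all moduli > 1
    ma = Mod(polyroot(c(1, ma)))    # invertibility: all moduli > 1
  )
}

# Illustrative coefficients: the AR part is well inside the stationarity
# region, the MA part sits almost exactly on the invertibility boundary.
near <- root_moduli(ar = 0.5, ma = -0.9999)
near$ar
near$ma

# With the fitted object from the question, the call would be:
# root_moduli(ar = coef(Arimafit)[["ar1"]], ma = coef(Arimafit)[["ma1"]])
```

An MA root modulus that is barely above 1 is the situation to watch for before trusting a hand-computed forecast.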
I tried running your code twice. The first time, the estimated MA coefficient was -0.999999824, which is on the edge of the invertibility region. In that case, the two forecasts differed. The second time I ran the code, both coefficients were well inside the boundaries, and the forecasts matched.
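To look at the well-behaved case in isolation, here is a sketch where the fitted model has the same ARIMA(1,1,1) structure as the simulated data, so the estimates should land well inside the boundaries. It is not the code from the question. It uses base R's arima and predict as stand-ins for Arima and forecast, and the seed and the coefficients 0.5 and 0.3 are arbitrary assumptions.

```r
# Sketch in base R only: arima() and predict() from the stats package
# stand in for Arima() and forecast() from the forecast package.
set.seed(1)  # arbitrary seed, assumed here for reproducibility

# Simulate from an ARIMA(1,1,1); the coefficients 0.5 and 0.3 are illustrative.
x <- arima.sim(n = 1000, model = list(order = c(1, 1, 1), ar = 0.5, ma = 0.3))
n <- length(x)

# Fit the same structure that generated the data.
fit <- arima(x, order = c(1, 1, 1))
phi <- coef(fit)[["ar1"]]
theta <- coef(fit)[["ma1"]]

# Manual one-step forecast from the last two observations and the last residual.
manual <- x[n] + phi * (x[n] - x[n - 1]) + theta * residuals(fit)[n]

# One-step forecast from the Kalman-filter-based predict method.
auto <- predict(fit, n.ahead = 1)$pred[[1]]

# Absolute difference between the two forecasts.
abs(manual - auto)
```

If the difference printed at the end is negligible, that is consistent with the second run described above. Repeating the experiment with a fit whose MA coefficient is close to -1 is a way to probe the first case.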