I've fitted a model with auto arima, including independent variables, using the code below:
library(forecast)  # provides Arima()
Regressors <- cbind(GDPr_chg, r)
arima1 <- Arima(RVD_all_chg, order = c(2, 0, 0), seasonal = c(0, 0, 1), xreg = Regressors)
The model results are as follows:
ARIMA(2,0,0)(0,0,1)[4] with non-zero mean
Coefficients:
         ar1      ar2     sma1  intercept  GDPr_chg        r
      1.5285  -0.5668  -0.7920     0.0990    0.4459  -0.7613
s.e.  0.0766   0.0744   0.0724     0.0274    0.1948   0.3437
I know that xreg will fit an ARIMA model for the errors, so the model should be:
y = 0.099 + 0.4459*GDPr_chg - 0.7613*r + 1.5285*u(t-1) - 0.5668*u(t-2) - 0.7920*e(t-4) + random error
But I couldn't reproduce the fitted values using the above formula.
Can anyone give me a hint?
Also, how does R fit the error terms for the first few forecasts, when u(t-1) and u(t-2) are not yet available [as u(t) should be the regression residual at time t, right]?
Many thanks in advance, everyone.
Best Answer
The autoregressive coefficients are weights for the observed series $y_t$ not for the residuals $u_t$. The correct representation of the model for the output that you show is:
\begin{eqnarray} \begin{array}{ll} \eta_t = y_t - 0.099 - 0.4459\, \hbox{GDPr_chg}_t + 0.7613 \, \hbox{r}_t \\ y_t = 0.099 + 1.5285\,(y_{t-1} - \eta_{t-1}) - 0.5668\,(y_{t-2} - \eta_{t-2}) + u_t - 0.7920\,u_{t-4} \end{array} \end{eqnarray}
As the model contains an MA part, there isn't a closed-form expression to obtain the fitted values. An iterative procedure is required; stats::arima uses the Kalman filter. See also the post linked by @Stat in the comments above.
For an illustration of a simpler case with no MA part, you may be interested in this post and this post. The latter gives a numerical example of how to obtain out-of-sample forecasts in an AR model with an external regressor (the procedure is the same as for the fitted values, except that observed values are replaced by forecasts).
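As a rough illustration of the no-MA case, the sketch below fits an AR(2) model with one regressor to simulated data and compares two ways of getting in-sample fitted values: the observed series minus the innovations returned by stats::arima, and a hand-written recursion. It assumes the regression-with-ARMA-errors form described in ?arima (a linear regression whose error term follows the ARMA model). The data, the seed and the names x, eta and manual are made up for the example and do not come from the question.

```r
# Simulated example (hypothetical data, not the series from the question)
set.seed(1)
n_obs <- 200
x <- rnorm(n_obs)                                        # one external regressor
err <- arima.sim(model = list(ar = c(0.5, -0.3)), n = n_obs)
y <- 1 + 2 * x + err                                     # regression with AR(2) errors

fit <- arima(y, order = c(2, 0, 0), xreg = cbind(x = x), method = "ML")

# One-step fitted values implied by the Kalman filter innovations
fitted_kf <- as.numeric(y) - as.numeric(residuals(fit))

# Hand-written recursion, assuming the AR terms act on the regression errors
b0   <- coef(fit)[["intercept"]]
b1   <- coef(fit)[["x"]]
phi1 <- coef(fit)[["ar1"]]
phi2 <- coef(fit)[["ar2"]]
eta  <- as.numeric(y) - b0 - b1 * x                      # regression errors

manual <- rep(NA_real_, n_obs)
for (t in 3:n_obs) {
  manual[t] <- b0 + b1 * x[t] + phi1 * eta[t - 1] + phi2 * eta[t - 2]
}

# From t = 3 onwards the two versions should agree up to numerical error
print(max(abs(manual[3:n_obs] - fitted_kf[3:n_obs])))
```

The first two fitted values are left out of the comparison on purpose: no lagged errors exist there, and the Kalman filter fills them in from its initial state rather than from the recursion. Once an MA term is present, the hand-written loop no longer applies and y - residuals(fit) is the practical route.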
Note: there are other definitions of the ARIMA model. The signs of the coefficients and the definition of the mean/intercept may vary depending on the software implementation you are using. This answer and the related posts linked here are based on the R function stats::arima.