I'm using R for time series estimation. I'm trying to rebuild the fitted values from an Arima model by hand, using the estimated coefficients and the input data, so that I can reproduce them in an Excel spreadsheet. I can use the fitted command, but I want to understand how it works. For example:


N = ts(mvrnorm(50, mu=c(0,0), Sigma=matrix(c(1,0.56,0.56,1), ncol=2), 
       empirical=TRUE), frequency=12)

>            [,1]       [,2]
>[1,] -0.05270976  0.7239571
>[2,] -0.67232349 -0.6631604
>[3,] -0.20193415  0.8176053
>[4,] -0.54278281 -2.0458285
>[5,]  1.38279994  0.9405811
>[6,]  1.39979731  2.1717733

# Model: x(t) = a * x(t-1) + e(t)
fit = Arima(N[,1], order=c(1,0,0), include.constant=FALSE)

> fit  
>Series: N[, 1]  
>ARIMA(1,0,0) with zero mean          
>         ar1  
>       0.0293
>s.e.   0.1400  
>sigma^2 estimated as 0.9791:  log likelihood=-70.42
>AIC=144.84   AICc=145.1   BIC=148.66

# Build the fitted values: x(t)=a * x(t-1) 
pred  = fit$coef[1] * lag(fit$x, -1) 
pred1 = fitted(fit)
head(cbind(pred, pred1))   

>             pred         pred1
>[1,]           NA -2.255567e-05
>[2,] -0.001541849 -1.541849e-03
>[3,] -0.019666597 -1.966660e-02
>[4,] -0.005906915 -5.906915e-03
>[5,] -0.015877313 -1.587731e-02
>[6,]  0.040449232  4.044923e-02 

In this case, pred and pred1 match.

However, when I add an xreg:

# Model: x(t) = a*x(t-1) + b*xreg + e(t)
fit1 = Arima(N[,1], order=c(1,0,0), xreg=N[,2], include.constant=FALSE)

>Series: N[, 1]  
>ARIMA(1,0,0) with zero mean         
>         ar1  N[, 5]  
>       0.0860  0.5606  
>s.e.   0.1401  0.1155  
>sigma^2 estimated as 0.6675:  log likelihood=-60.85
>AIC=127.69   AICc=128.22   BIC=133.4

# Build the fitted values: x(t) = a*x(t-1) + b*xreg 
pred2  = fit1$coef[1]*lag(fit1$x, -1) + fit1$coef[2]*fit1$xreg 
pred21 = fitted(fit1) 
head(cbind(pred2, pred21))

>              pred2     pred21
>[1,]         NA  0.4041670
>[2,]  0.4013329 -0.4112205
>[3,] -0.4296032  0.4325201
>[4,]  0.4410005 -1.2037229
>[5,] -1.1936161  0.5792684
>[6,]  0.6462336  1.2911169

In this case, pred2 and pred21 do not match, even though the only change was adding an xreg. The only case where I cannot rebuild the fitted values by hand is when an AR term is included; I was able to do it when only MA terms were included with the xreg. I would really appreciate knowing how Arima treats xreg when generating the fitted values.

Best Answer

You have misunderstood the model. It is not $$ y_t = ay_{t-1} + bx_t + e_t$$ as you assume. Rather it is \begin{align} y_t & = bx_t + n_t \\ n_t &= a n_{t-1} + e_t. \end{align} This is explained in the help file for arima:

If an xreg term is included, a linear regression (with a constant term if include.mean is true and there is no differencing) is fitted with an ARMA model for the error term.
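
To make the consequence for hand-built fitted values concrete, here is a minimal sketch. Substituting $n_{t-1} = y_{t-1} - bx_{t-1}$ into the second equation gives a one-step fitted value of $bx_t + a(y_{t-1} - bx_{t-1})$ for $t \ge 2$. The example uses base R's arima on simulated data; the simulated series and all variable names are assumptions for illustration, not the data from the question. The same recursion should apply to Arima from the forecast package, since it wraps arima.

```r
# Sketch: rebuild fitted values for a regression with AR(1) errors by hand.
# Uses base R's stats::arima; the simulated data and all variable names
# below are illustrative assumptions, not the data from the question.
set.seed(1)
n <- 200
x <- rnorm(n)                                     # regressor
y <- 0.5 * x + arima.sim(list(ar = 0.6), n = n)   # y = b*x + AR(1) error

fit <- arima(y, order = c(1, 0, 0), xreg = x, include.mean = FALSE)
a <- unname(coef(fit)[1])   # AR(1) coefficient
b <- unname(coef(fit)[2])   # regression coefficient on x

# stats::arima has no fitted() method, so recover the fitted values
# from the residuals: fitted = observed - residual.
fitted_vals <- as.numeric(y - residuals(fit))

# Regression error n_t = y_t - b*x_t, then
# fitted_t = b*x_t + a*n_{t-1} for t >= 2.
y_num <- as.numeric(y)
n_err <- y_num - b * x
manual <- c(NA, b * x[-1] + a * n_err[-n])

# Compare from t = 2 onward; the first observation has no lagged
# error available, so the software handles it differently.
max_diff <- max(abs(manual[-1] - fitted_vals[-1]))
print(max_diff)
```

In a spreadsheet, the same calculation needs one extra column for the regression error $y_t - bx_t$; each fitted value is then $bx_t$ plus $a$ times the previous row's error. The first row will not match this formula, because there is no earlier error to lag.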

There is further discussion comparing these two models on my blog at

Note: You appear to be using the forecast package, although this is not loaded in your code.

