So I make the forecast
$x_{t+1}=0.7x_t+\epsilon_{t+1}+0.8\epsilon_t$.
Now where I get confused is with the future value of the error term
$\epsilon_{t+1}$. How can I obtain this value? Should I also predict the residuals,
or should I use the expected value of $\epsilon_{t+1}$, which should be 0? Hope my
question is clear =)
Yes, you are right, but you mean an ARMA(1,1), i.e. an ARIMA(1,0,1). So you want to calculate the one-period forecast of an ARMA(1,1), which is the same as an ARIMA(1,0,1).
The one-period forecast is given by:
$\hat{x}_{t+1|t}=E(\phi_1x_t + \delta + \epsilon_{t+1} + \theta \epsilon_t|x_t,...,x_1)$
In your case it seems that you have no constant term, i.e. $\delta = 0$.
So this gives in your case
$\hat{x}_{t+1|t}=E(0.7 x_t + \epsilon_{t+1} + 0.8 \epsilon_t|x_t,...,x_1)$
Since $E(\epsilon_{t+1})=0$ this gives
$\hat{x}_{t+1|t}=0.7x_t+0.8\epsilon_t$
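To make this concrete, here is a minimal numerical check (a Python sketch; the values of $x_t$ and $\epsilon_t$ are made up for illustration):

```python
# One-step ARMA(1,1) forecast: the future shock eps_{t+1} has mean 0,
# so it drops out and only the known terms remain.
phi, theta = 0.7, 0.8
x_t, eps_t = 1.2, -0.3  # made-up values for illustration

x_hat = phi * x_t + theta * eps_t
print(round(x_hat, 12))  # 0.7*1.2 + 0.8*(-0.3) = 0.6
```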
About the value of $\epsilon_{t+1}$: should I estimate it by
assuming (as is normally assumed in the literature) that the noise process
$\epsilon_t$ is normally distributed, $\epsilon_t \sim \text{iid } N(0,\sigma_\epsilon^2)$, then use estimation
techniques (least squares, maximum likelihood, Yule-Walker) to
estimate the noise variance $\hat{\sigma}_\epsilon^2$, and then just
draw a value for $\epsilon_{t+1} \sim \text{iid } N(0,\hat{\sigma}_\epsilon^2)$ from the estimated Gaussian
distribution?
I am not sure: do you mean you want to calculate the "variance" of the forecast to get prediction intervals? This is done by calculating the mean squared error. To follow the derivation you have to know the principles of time series analysis; I just give a short summary and its application to your case:
The forecast error is given by
$MSE(\hat{x}_{t+I|t})=E((x_{t+I}-\hat{x}_{t+I|t})^2)=(1+\Psi_1^2+\dots+\Psi_{I-1}^2)\sigma_\epsilon^2$
The $\Psi$ are obtained as follows:
$a(L)x_t=b(L)\epsilon_t$
In your case of a specific ARMA(1,1):
$(1-0.7L)x_t=(1+0.8L)\epsilon_t$
This can be written as (look at a time series book):
$(1-0.7L)(1+ \Psi_1L+ \Psi_2L^2+\dots)=1+0.8L$
You can solve this to obtain the $\Psi$ values. This is not necessary in this case, since you only need the MSE of the one-step-ahead prediction. So
$MSE(\hat{x}_{t+I|t})=E((x_{t+I}-\hat{x}_{t+I|t})^2)=(1+\Psi_1^2+\dots+\Psi_{I-1}^2)\sigma_\epsilon^2$
reduces in case of $I=1$ to
$MSE(\hat{x}_{t+1|t})=\sigma_\epsilon^2$
The $\sigma_\epsilon^2$ can be obtained from your output in R/Stata or whatever you use.
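If you do want multi-step prediction intervals, the $\Psi$ recursion and the MSE formula above can be sketched as follows (Python for illustration; the parameters 0.7 and 0.8 are from the question, while $\sigma_\epsilon^2 = 1$ is an assumed placeholder):

```python
# Psi weights of the ARMA(1,1): matching coefficients of L in
# (1 - 0.7L)(1 + Psi_1 L + Psi_2 L^2 + ...) = 1 + 0.8L
# gives Psi_1 = phi + theta and Psi_j = phi * Psi_{j-1} for j >= 2.
phi, theta, sigma2 = 0.7, 0.8, 1.0  # sigma2 = 1 is a placeholder

def psi_weights(phi, theta, k):
    """First k Psi weights of an ARMA(1,1)."""
    psi = []
    for j in range(1, k + 1):
        psi.append(phi + theta if j == 1 else phi * psi[-1])
    return psi

def forecast_mse(l, phi, theta, sigma2):
    """MSE of the l-step forecast: (1 + Psi_1^2 + ... + Psi_{l-1}^2) * sigma2."""
    return (1 + sum(p * p for p in psi_weights(phi, theta, l - 1))) * sigma2

print(forecast_mse(1, phi, theta, sigma2))  # 1.0, just sigma_eps^2
print(forecast_mse(2, phi, theta, sigma2))  # 1 + (0.7 + 0.8)^2 = 3.25
```

For $I=1$ the sum is empty, which recovers $MSE(\hat{x}_{t+1|t})=\sigma_\epsilon^2$ as stated above.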
To answer your questions, you basically need to know how the residuals, i.e. $e_t$, are calculated in an ARMA
model, because then $\hat{X}_{t}=X_{t}-e_{t}$. Let's first generate fake data ($X_t$) from an ARMA(1,1) with ar = 0.5 and ma = 0.6,
and fit the ARMA model (without mean):
library(forecast)
n=1000
ts_AR <- arima.sim(n = n, list(ar = 0.5,ma=0.6))
f=arima(ts_AR,order=c(1,0,1),include.mean=FALSE)
summary(f)
Series: ts_AR
ARIMA(1,0,1) with zero mean
Coefficients:
ar1 ma1
0.4879 0.5595
s.e. 0.0335 0.0317
sigma^2 estimated as 1.014: log likelihood=-1426.7
AIC=2859.4 AICc=2859.42 BIC=2874.12
Training set error measures:
ME RMSE MAE MPE MAPE MASE
Training set 0.02102758 1.00722 0.8057205 40.05802 160.1078 0.6313145
Now I create the residuals as follows: $e_1=0$ (since there is no residual at $t=1$) and for $t=2,\dots,n$ we have $e_t=X_t-Ar\,X_{t-1}-Ma\,e_{t-1}$, where $Ar$ and $Ma$ are the estimated autoregressive and moving-average coefficients from the model fitted above. Here is the code:
e = rep(0, n) ## vector to hold the residuals
e[1] = 0 ## since there is no residual at 1, e1 = 0
for (t in (2 : n)){
e[t] = ts_AR[t]-coef(f)[1]*ts_AR[t-1]-coef(f)[2]*e[t-1]
}
Once you find the residuals $e_{t}$, the fitted values are just $\hat{X}_{t}=X_{t}-e_{t}$. So in the following, I compare the first 10 fitted values obtained from R with the ones calculated manually from the $e_{t}$ created above.
cbind(fitted.from.package=fitted(f)[1:10],fitted.calculated.manually=ts_AR[1:10]-e[1:10])
fitted.from.package fitted.calculated.manually
[1,] -0.4193068 -1.1653515
[2,] -0.8395447 -0.5685977
[3,] -0.4386956 -0.6051324
[4,] 0.3594109 0.4403898
[5,] 2.9358336 2.9013738
[6,] 1.3489537 1.3682191
[7,] 0.5329436 0.5219576
[8,] 1.0221220 1.0283511
[9,] 0.6083310 0.6048668
[10,] -0.5371484 -0.5352324
As you see, they are close but not exactly the same. The reason is that when I created the residuals I set $e_{1}=0$; there are other choices, though. For example, based on the help file of arima,
the residuals and their variance are found by a Kalman filter, so its calculation of $e_t$ will be slightly different from mine. But as time goes on they converge.
Now for the AR(1) model. I fitted the model (without mean) and directly show how to calculate the fitted values using the coefficient. This time I didn't calculate the residuals. Note that I report the first 10 fitted values with the first one removed (as, again, it would differ depending on how you define it). As you can see, they are exactly the same.
f=arima(ts_AR,order=c(1,0,0),include.mean=FALSE)
cbind(fitted.from.package=fitted(f)[2:10],fitted.calculated.manually=coef(f)*ts_AR[1:9])
fitted.from.package fitted.calculated.manually
[1,] -0.8356307 -0.8356307
[2,] -0.6320580 -0.6320580
[3,] 0.0696877 0.0696877
[4,] 2.1549019 2.1549019
[5,] 2.0480074 2.0480074
[6,] 0.8814094 0.8814094
[7,] 0.9039184 0.9039184
[8,] 0.8079823 0.8079823
[9,] -0.1347165 -0.1347165
Best Answer
For your ARMA(1,1) example, you need to assume you know the first error term $e_{1}$ (you can fix it to zero). Then you run the model (starting from $X_{2}$ if your first data point is $X_{1}$) with different AR and MA coefficients, obtaining a different error series for each candidate pair. For each error series you compute the associated likelihood, and finally the parameters (AR and MA) that give you the highest likelihood are the estimated parameters. Obviously, you need to assume a distribution for the errors.
So in short, you obtain the AR and MA parameters by trying different values through a maximization algorithm using the MLE method.
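As a rough illustration of this recipe (a Python sketch, not how R's arima actually works internally; R uses a numerical optimizer and, by default, a Kalman-filter likelihood), here is a grid search over (AR, MA) that fixes $e_1 = 0$, builds the error series by recursion, and picks the pair with the highest Gaussian likelihood:

```python
import math, random

# Simulate an ARMA(1,1) with known parameters (illustrative choice).
random.seed(1)
phi_true, theta_true, n = 0.5, 0.6, 1500
eps = [random.gauss(0, 1) for _ in range(n)]
x = [0.0] * n
for t in range(1, n):
    x[t] = phi_true * x[t-1] + eps[t] + theta_true * eps[t-1]

def neg_loglik(phi, theta, x):
    # Fix e_1 = 0, then build the error series by the recursion
    # e_t = x_t - phi*x_{t-1} - theta*e_{t-1}.
    e = [0.0] * len(x)
    for t in range(1, len(x)):
        e[t] = x[t] - phi * x[t-1] - theta * e[t-1]
    sse = sum(v * v for v in e[1:])
    sigma2 = sse / (len(x) - 1)  # error variance, profiled out
    # Negative Gaussian log-likelihood with sigma2 plugged back in.
    return 0.5 * (len(x) - 1) * (math.log(2 * math.pi * sigma2) + 1)

# Crude grid search in place of a real maximization algorithm.
grid = [i / 100 for i in range(-95, 96, 5)]
phi_hat, theta_hat = min(((p, q) for p in grid for q in grid),
                         key=lambda pq: neg_loglik(pq[0], pq[1], x))
print(phi_hat, theta_hat)  # should land near (0.5, 0.6)
```

This conditional-likelihood approach is essentially least squares on the recursively computed errors; packages refine it with proper optimizers and exact-likelihood corrections for the startup values.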