This question is similar to the following question in the sense that I am currently doing the differencing and mean removal of the time series outside the Arima function in R, and I do not know how to do these steps within the Arima function. The reason is that I am trying to perform the following procedure (the data dowj_ts can be found at the bottom):
library(forecast) # provides Arima() and forecast()
dowj_ts_d1 <- diff(dowj_ts) # differencing at lag 1 (1-B)
drift <- mean(dowj_ts_d1) # sample mean of the differenced series
dowj_ts_d1_demeaned <- dowj_ts_d1 - drift # mean removal
# Maximum Likelihood AR(1) for the mean-corrected differences X_t
fit <- Arima(dowj_ts_d1_demeaned, order = c(1, 0, 0), include.mean = FALSE, transform.pars = TRUE)
Note that the drift is actually 0.1336364, and summary(fit) gives the table below:
Series: dowj_ts_d1_demeaned
ARIMA(1,0,0) with zero mean

Coefficients:
         ar1
      0.4471
s.e.  0.1051

sigma^2 estimated as 0.1455:  log likelihood=-35.16
AIC=74.32   AICc=74.48   BIC=79.01

Training set error measures:
                       ME     RMSE       MAE       MPE     MAPE      MASE        ACF1
Training set -0.004721362 0.381457 0.2982851 -9.337089 209.6878 0.8477813 -0.04852626
Ultimately, I want a 2-step-ahead forecast of the original series, and this starts to get ugly:
tail(c(dowj_ts[1], dowj_ts[1] + cumsum(c(dowj_ts_d1_demeaned, forecast(fit, h = 2)$mean) + drift)), 2)
Currently all of this is done outside the Arima function from the forecast package. I know I can do the differencing within Arima like this:
Arima(dowj_ts, order=c(1,1,0),include.drift=T,transform.pars = F)
This gives:
Series: dowj_ts
ARIMA(1,1,0) with drift

Coefficients:
         ar1   drift
      0.4478  0.1204
s.e.  0.1059  0.0786

sigma^2 estimated as 0.1474:  log likelihood=-34.69
AIC=75.38   AICc=75.71   BIC=82.41
But the drift term computed by R is different from the drift = 0.1336364 that I computed manually.
So my question is: how can I difference the series and then remove the mean of the differenced series within the Arima function?
Second question: why is the drift term estimated by Arima different from the drift term I computed? In fact, what does the mathematical model look like when include.drift = T? This really confuses me.
Data can be found below:
structure(c(110.94, 110.69, 110.43, 110.56, 110.75, 110.84, 110.46,
110.56, 110.46, 110.05, 109.6, 109.31, 109.31, 109.25, 109.02,
108.54, 108.77, 109.02, 109.44, 109.38, 109.53, 109.89, 110.56,
110.56, 110.72, 111.23, 111.48, 111.58, 111.9, 112.19, 112.06,
111.96, 111.68, 111.36, 111.42, 112, 112.22, 112.7, 113.15, 114.36,
114.65, 115.06, 115.86, 116.4, 116.44, 116.88, 118.07, 118.51,
119.28, 119.79, 119.7, 119.28, 119.66, 120.14, 120.97, 121.13,
121.55, 121.96, 122.26, 123.79, 124.11, 124.14, 123.37, 123.02,
122.86, 123.02, 123.11, 123.05, 123.05, 122.83, 123.18, 122.67,
122.73, 122.86, 122.67, 122.09, 122, 121.23), .Tsp = c(1, 78,
1), class = "ts")
Best Answer
The code is fine. You should be able to call forecast on this without a problem.
The reason your drift estimate is different is that Arima estimates the parameters by maximum likelihood, and your sample mean is not the maximum likelihood estimate of this parameter. There is no closed-form expression for the maximum likelihood estimates; they have to be found using an iterative algorithm.
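To make the model behind include.drift a bit more concrete: as far as I understand it, for d = 1 the drift is fitted as a regression on the time index with ARIMA errors, that is y_t = c*t + n_t with n_t an ARIMA(1,1,0) process, which can be rewritten as (1 - phi*B)((1 - B)y_t - c) = Z_t. So c plays the role of the mean of the differenced series, but it is estimated jointly with phi, whereas the sample mean of the differences telescopes to (y_n - y_1)/(n - 1) and only uses the two end points. The sketch below mimics this with base R's arima and an xreg column. The simulated series, the seed and all variable names are assumptions made for illustration; it does not use the dowj_ts data.

```r
# Minimal sketch on simulated data (NOT the dowj_ts series from the question).
set.seed(1)
n <- 200

# Increments follow an AR(1) around a mean of 0.13; the level series is their cumulative sum.
increments <- 0.13 + 0.4 * as.numeric(arima.sim(model = list(ar = 0.5), n = n))
y <- ts(cumsum(increments))

# A drift term expressed as a regression on the time index 1..n.
drift_xreg <- cbind(drift = seq_len(n))

# ARIMA(1,1,0) errors plus the time-index regressor, fitted by maximum likelihood.
fit_drift <- arima(y, order = c(1, 1, 0), xreg = drift_xreg)

# Compare the jointly estimated drift with the plain sample mean of the differences.
drift_mle  <- unname(coef(fit_drift)["drift"])
drift_mean <- mean(diff(y))
print(c(mle = drift_mle, sample_mean = drift_mean))

# 2-step-ahead forecast on the original scale; the regressor is extended to n+1, n+2.
future_xreg <- cbind(drift = n + 1:2)
fc <- predict(fit_drift, n.ahead = 2, newxreg = future_xreg)$pred
print(fc)
```

The two drift values should be close but generally not identical, which is the same effect as 0.1204 versus 0.1336364 in the question. Because the differencing and the drift are both inside the model, the forecasts come out on the original scale and no manual cumsum is needed.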