Solved – Rolling forecasts: training versus forecast accuracy evaluation

forecasting, moving window, time series, train

Questions:

  1. Are rolling forecast examples (like the ones below) only useful for evaluating a model's accuracy, or can a rolling forecast be used to train a model?
  2. Are models trained using a rolling forecast generally more accurate?
  3. Can anyone point to an example of a model that is trained using a rolling window/rolling forecast technique and then used to forecast horizons into the future? By that I mean horizons beyond the training/testing data used in the rolling forecast.

Examples:
http://robjhyndman.com/hyndsight/tscvexample/
http://robjhyndman.com/hyndsight/rolling-forecasts/

Code:

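library("fpp")  # attaches the forecast package and the hsales data

h <- 5  # forecast horizon, in months
train <- window(hsales, end=1989.99)
test <- window(hsales, start=1990)
n <- length(test) - h + 1  # number of rolling forecast origins
# Select and estimate the model once, on the training data only
fit <- auto.arima(train)
# Storage for the h-step-ahead forecasts, aligned with their target months
fc <- ts(numeric(n), start=1990+(h-1)/12, frequency=12)
for(i in 1:n)
{
  # Extend the sample by one month per iteration
  x <- window(hsales, end=1989.99 + (i-1)/12)
  # Apply the already-fitted model to the extended sample without re-estimating it
  refit <- Arima(x, model=fit)
  fc[i] <- forecast(refit, h=h)$mean[h]
}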

Best Answer

  1. For a given functional form (e.g. a given order of an ARIMA model), estimating the model on all available data is statistically more efficient than estimating it on a subset, provided the data generating process does not change over time. If, on the other hand, the process evolves over time, "old" data may be unrepresentative of a "late" period in the sample, and "new" data may be unrepresentative of an "early" period. Discounting or completely dropping early observations may then help capture the recent state of the process, which should be useful for forecasting yet-unobserved data. In other words, rolling windows may come in handy for estimation, not just for evaluation. They may also help assess whether a model estimated on an "early" subsample continues to deliver stable forecasting performance throughout the rest of the sample. If it does not (e.g. performance worsens with time), that is an indication that the data generating process may be evolving over time.
  2. See 1. for a theoretical argument. I cannot offer empirical evidence, though.
  3. I think this strategy is more relevant for model selection (e.g. choosing the AR and MA orders of an ARMA model) than for estimating a model with a fixed functional form (e.g. fixed ARMA orders). Once the functional form has been selected, you would want to estimate the model on all available data, since omitting data is generally inefficient.
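
To make points 1 and 3 concrete, here is a minimal sketch in base R. It is not the code from the question or from the linked posts. For simplicity it swaps in plain AR models fitted by Yule-Walker with `ar()` instead of `auto.arima`, and uses the built-in `lynx` series instead of `hsales`. The helper `cv_mse` and the values of `first_origin`, `window_len` and `orders` are hypothetical choices made for illustration. The sketch scores each candidate order by rolling-origin forecast error, once with an expanding window and once with a fixed-length window, then re-estimates the selected order on all the data and forecasts beyond the end of the sample.

```r
# Illustrative data: a built-in annual series (an assumption, not the hsales data)
y <- log10(lynx)
h <- 2               # forecast horizon
first_origin <- 60   # size of the first training sample (hypothetical choice)
window_len <- 60     # length of the fixed rolling window (hypothetical choice)
orders <- 1:4        # candidate AR orders

# Mean squared h-step-ahead forecast error over all forecast origins.
# rolling = FALSE: expanding window (all data up to the origin)
# rolling = TRUE : fixed-length window (only the latest window_len points)
cv_mse <- function(y, p, h, first_origin, window_len, rolling) {
  origins <- first_origin:(length(y) - h)
  errs <- sapply(origins, function(t) {
    first <- if (rolling) t - window_len + 1 else 1
    x <- as.numeric(y[first:t])
    fit <- ar(x, order.max = p, aic = FALSE)   # AR(p) by Yule-Walker
    pred <- predict(fit, newdata = x, n.ahead = h)$pred
    y[t + h] - pred[h]
  })
  mean(errs^2)
}

mse_expanding <- sapply(orders, function(p)
  cv_mse(y, p, h, first_origin, window_len, rolling = FALSE))
mse_rolling <- sapply(orders, function(p)
  cv_mse(y, p, h, first_origin, window_len, rolling = TRUE))

# Model selection: pick the order with the smallest out-of-sample error
best_p <- orders[which.min(mse_expanding)]

# Estimation: refit the selected order on ALL the data, then forecast
# h steps beyond the end of the sample
final_fit <- ar(y, order.max = best_p, aic = FALSE)
future <- predict(final_fit, newdata = y, n.ahead = h)$pred

print(rbind(expanding = mse_expanding, rolling = mse_rolling))
print(future)
```

If the fixed-length window gave clearly smaller errors than the expanding window, that would hint at a data generating process that changes over time, and the final fit could then be restricted to the most recent `window_len` observations instead of the full sample.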