Time Series Data – Choosing Inner Cross Validation Strategy for Modeling Time Series Data

cross-validation, forecasting, modeling, predictive-models, time-series

We know that forward chaining, a.k.a. time series cross validation, is more appropriate than standard CV techniques for a time-series dataset.

However, there's relatively little discussion around the choice of inner CV loop of time series data when trying to evaluate the model's expected accuracy.

Generally speaking, what type of cross-validation is appropriate for the inner loop (e.g. hyperparameter selection)? Should this also always be done in a forward-chaining manner for best results?

Best Answer

I do not quite see why the time series cross validation (TSCV) technique/design should depend on whether it is used for training the model or for evaluating its performance. But perhaps I am ignorant of something?

One rather simple and easy-to-use TSCV technique is the use of rolling windows. If we have a sample of $T$ observations, we may estimate the model using a window of $T_1<T$ consecutive observations and test or evaluate the model's performance by examining how well the model predicts the subsequent one or more observations for each window. So if you have a sample of 100, you could take

  • 1 through 70 as the first rolling window,
  • 2 through 71 as the second rolling window,
  • ...,
  • 30 through 99 as the last rolling window,

and assess the predictive accuracy for observations 71, 72, ..., 100, respectively. This is just an example; the proportions of training and testing data as well as the forecast horizon could be varied. Rob J. Hyndman provides an illustration in his blog post "Time series cross-validation: an R example".
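The rolling-window scheme above is straightforward to sketch in code. The function below (a hypothetical helper, not from any particular library) generates the train/test index pairs for the 100-observation example, using 0-based indices:

```python
import numpy as np

def rolling_window_cv(n_obs, window_size, horizon=1):
    """Yield (train_indices, test_indices) pairs for rolling-window TSCV.

    Each training window contains `window_size` consecutive observations;
    the test set is the next `horizon` observations after the window.
    """
    for start in range(n_obs - window_size - horizon + 1):
        train = np.arange(start, start + window_size)
        test = np.arange(start + window_size, start + window_size + horizon)
        yield train, test

# Reproduce the example from the text: T = 100, windows of T_1 = 70.
splits = list(rolling_window_cv(100, 70))
# First window: observations 1..70 (indices 0..69), forecast observation 71.
# Last window: observations 30..99 (indices 29..98), forecast observation 100.
```

This yields 30 windows, each producing a one-step-ahead forecast that can be scored against the held-out observation.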

However, there are alternatives. For example, standard $K$-fold CV may be sensible even for time series data in certain setups. This is discussed in detail in Bergmeir et al. "A Note on the Validity of Cross-Validation for Evaluating Time Series Prediction" (working paper).
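A minimal sketch of the $K$-fold idea in that setting: if the series is embedded as a purely autoregressive supervised problem (predict $y_t$ from $y_{t-1}$), shuffled $K$-fold over the lagged pairs can be applied directly, since each (input, target) pair carries the serial dependence the model needs. The simulated AR(1) data and model choice here are illustrative assumptions, not from the cited paper:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)

# Simulate an AR(1) series (illustrative data, standard normal shocks)
y = np.zeros(500)
for t in range(1, len(y)):
    y[t] = 0.7 * y[t - 1] + rng.normal()

# Embed as a supervised problem: predict y[t] from y[t-1]
X, target = y[:-1].reshape(-1, 1), y[1:]

# Standard shuffled K-fold over the embedded (lag, target) pairs
kf = KFold(n_splits=5, shuffle=True, random_state=0)
mse_per_fold = []
for train_idx, test_idx in kf.split(X):
    model = LinearRegression().fit(X[train_idx], target[train_idx])
    resid = model.predict(X[test_idx]) - target[test_idx]
    mse_per_fold.append(np.mean(resid ** 2))

cv_mse = np.mean(mse_per_fold)  # should be near the shock variance of 1
```

Whether this is valid depends on the model capturing the dependence structure (and on roughly uncorrelated errors); for misspecified models or strongly dependent residuals, the forward-chaining schemes above remain the safer default.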
