Solved – How to decide moving window size for time series prediction

cross-validation, moving window, predictive-models, time series

I have a model that predicts this time series one day ahead.

[Image: chart of the time series]

Looking at the chart, you can notice some seasonality every 5 days. I suspect that using a moving window as the training set could help me make a better prediction.

However, I want to programmatically find the best moving window size for my model. Are the approaches below valid? Should I do something different?

Approach 1. I run the model on the historical data with every possible window size and pick the window size that minimises the prediction error. This approach is simple and fast, but I am afraid it overfits the window size to the historical data. Right?
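To make Approach 1 concrete, here is a minimal sketch in Python. The actual model is not shown in this post, so `forecast_next` is a hypothetical stand-in (the mean of the last `window` observations), and the series is synthetic data with a 5-day cycle, made up purely for illustration. The names `rolling_mae` and `best_window` are also just illustrative.

```python
import math
import random
from statistics import mean


def forecast_next(history, window):
    """Hypothetical stand-in model: predict the mean of the last `window` points."""
    return mean(history[-window:])


def rolling_mae(series, window, start):
    """Mean absolute error of one-step-ahead forecasts for targets series[start:].

    Each forecast only sees the observations strictly before its target.
    """
    errors = [abs(series[t] - forecast_next(series[:t], window))
              for t in range(start, len(series))]
    return mean(errors)


def best_window(series, candidates):
    """Approach 1: pick the window size with the lowest error on the whole history."""
    start = max(candidates)  # score every candidate on the same set of targets
    scores = {w: rolling_mae(series, w, start) for w in candidates}
    return min(scores, key=scores.get), scores


# Synthetic series with a 5-day cycle plus noise (assumption, not the real data).
random.seed(0)
series = [10 + 3 * math.sin(2 * math.pi * t / 5) + random.gauss(0, 0.5)
          for t in range(200)]

window, scores = best_window(series, candidates=range(1, 16))
print(window, round(scores[window], 3))
```

Because the window size is chosen on the same data the error is measured on, the reported error for the chosen window is likely to be optimistic, which is exactly the overfitting concern raised above.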

Approach 2. I use leave-one-out cross-validation (LOOCV) to get a more realistic prediction error. Is this better or worse than Approach 1?

Best Answer

I would say that your first approach seems like a good start, and it seems better to me than your second one. Your assessment of the risk is correct: you could interpret this as tuning a hyperparameter on the test set, which comes with the risk that the performance estimate is too optimistic. One idea is to adapt your first approach to include a validation set on which you tune the window size, and then use the test set only to obtain a performance estimate. In case you are unfamiliar with this, I quite like what is discussed in this thread: What is the difference between test set and validation set?
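A rough sketch of that idea in Python, under some assumptions: the real model is unknown, so `forecast_next` is a hypothetical placeholder (mean of the last `window` observations), the series is synthetic, and the split points `val_start` and `test_start` are arbitrary choices for illustration. The split is chronological, so the validation period comes before the test period.

```python
import math
import random
from statistics import mean


def forecast_next(history, window):
    # Hypothetical stand-in model: mean of the last `window` observations.
    return mean(history[-window:])


def rolling_mae(series, window, start, stop):
    # One-step-ahead MAE for targets series[start:stop]; each forecast
    # only sees observations strictly before its target.
    return mean(abs(series[t] - forecast_next(series[:t], window))
                for t in range(start, stop))


def select_and_evaluate(series, candidates, val_start, test_start):
    # Tune the window size on the validation period only.
    val_scores = {w: rolling_mae(series, w, val_start, test_start)
                  for w in candidates}
    chosen = min(val_scores, key=val_scores.get)
    # Use the test period once, with the window size already fixed.
    test_mae = rolling_mae(series, chosen, test_start, len(series))
    return chosen, val_scores[chosen], test_mae


# Synthetic series with a 5-day cycle plus noise (assumption, not the real data).
random.seed(0)
series = [10 + 3 * math.sin(2 * math.pi * t / 5) + random.gauss(0, 0.5)
          for t in range(200)]

chosen, val_mae, test_mae = select_and_evaluate(
    series, candidates=range(1, 16), val_start=100, test_start=150)
print(chosen, round(val_mae, 3), round(test_mae, 3))
```

The validation error is used for the choice and so may still be optimistic; the test error is the number to report, since the window size was never tuned against it.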

Hope this helps!
