Solved – Sliding window for time series modelling

machine learning, modeling, predictive-models, sequence analysis, time series

I am modelling a univariate time series of the form shown. The series is sampled daily, i.e. one value of y was collected each day.

I want to use a sliding window method to model this, but the key point is that my task is to predict y 120 days ahead: given all historical data up to time t, the model needs to predict y(t+120).

As I understand it, the sliding window method works like this: in the training set, use y(i) as the input and y(i+1) as the output, construct samples this way for every i, and then train the model to predict one step ahead (or several steps ahead).

In my case, however, I only care about the value of y after 120 days. I don't feel confident about getting from y(n) to y(n+120) by stepping the forecast ahead one day at a time; it would get out of control to some extent.

So I plan to construct the training set as follows: input y(1), output y(121); input y(2), output y(122); and so on. Once the model is trained, I can feed in the latest value, say y(n), and expect the model to output y(n+120).

Could someone please advise whether my method makes sense, or how I can revise my methodology to continue my modelling? Highly appreciated.


Best Answer

Your method of building a model to directly predict 120 steps ahead makes sense. (How accurate the forecast will be will depend on your data, of course.)
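To make the direct approach concrete, here is a minimal sketch in plain Python. The helper names `make_direct_pairs` and `fit_line`, the `window` parameter, and the toy linear series are all assumptions made up for illustration; a straight-line fit stands in for whatever model you actually choose.

```python
def make_direct_pairs(series, horizon=120, window=1):
    """Pair the `window` most recent values up to time t with the value at t + horizon."""
    X, y = [], []
    for t in range(window - 1, len(series) - horizon):
        X.append(list(series[t - window + 1 : t + 1]))
        y.append(series[t + horizon])
    return X, y


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with a single input."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Toy data (an assumption, not the asker's series): a noiseless trend over 400 days.
series = [0.5 * day for day in range(400)]

# Each training input y(t) is paired with the target y(t + 120).
X, y = make_direct_pairs(series, horizon=120, window=1)
a, b = fit_line([row[0] for row in X], y)

# Forecast y(n + 120) from the latest observation y(n).
forecast = a + b * series[-1]
```

Note that the last 120 observations have no target yet, so they cannot be used as training inputs; with a short series this reduces the training set noticeably. Setting `window` above 1 gives the model several recent values as input instead of only the latest one.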

Alternatively, you could look into classical time series forecasting methods, e.g., Exponential Smoothing. Forecasting: Principles and Practice by Hyndman & Athanasopoulos is a good reference.
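For orientation, the simplest member of that family, simple exponential smoothing, can be sketched in a few lines of plain Python. The function name, the hand-picked `alpha`, the choice to initialise the level with the first observation, and the toy numbers are assumptions for illustration; in practice the smoothing parameter is usually estimated from the data by a forecasting library.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed level after one pass over the series."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]  # assumption: start the level at the first observation
    for value in series[1:]:
        # New level is a weighted average of the new value and the old level.
        level = alpha * value + (1 - alpha) * level
    return level


# Toy data (an assumption), far shorter than a real daily series.
series = [10.0, 12.0, 11.0, 13.0, 12.0]
level = simple_exponential_smoothing(series, alpha=0.5)

# Simple exponential smoothing gives a flat forecast: the same value at every horizon.
forecast_120 = level
```

Because the forecast is flat, this basic variant only makes sense for a series without trend or seasonality; the variants with trend and seasonal components covered in the book are the ones to look at otherwise.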

Think about whether you have some kind of seasonality in your daily data: intra-yearly, intra-weekly or even intra-monthly. The book suggests a few tools for this, like seasonal plots (seasonplot in R's forecast package). ARIMA or similar would be useful if you only have weekly seasonality. If you have yearly seasonality, ARIMA etc. may run into problems; in that case, you may want to look at TBATS models, or search for "daily forecast" on CV (Cross Validated).
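As a rough numeric complement to a seasonal plot, you can compare the sample autocorrelation at candidate seasonal lags (7 for weekly, about 365 for yearly) with that at other lags. The sketch below is plain Python; the `autocorrelation` helper and the synthetic weekly series are assumptions for illustration, and this is not the seasonplot tool from the book.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation of the series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    num = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return num / denom


# Synthetic daily data (an assumption): a pure weekly cycle over 52 weeks.
series = [math.sin(2 * math.pi * day / 7) for day in range(364)]

acf_7 = autocorrelation(series, 7)  # lag equal to the weekly period
acf_3 = autocorrelation(series, 3)  # a lag that does not match the period
```

On this synthetic series the lag-7 autocorrelation is close to 1 while the lag-3 value is negative. Real data will be far less clean, so treat a peak at lag 7 or 365 as a hint to investigate rather than proof of seasonality.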