Solved – Training one model to work for many time series

dataset, forecasting, panel data, time series

I have been working with time series data to try to make multi-step demand forecasts for products. There are thousands of products, and it is computationally very expensive and labour-intensive to tune a separate model for each product.

As far as I can see there are a couple of realistic options at my disposal:

  1. Try to group 'similar' products together. Based on their time series they do not look correlated, but perhaps there is some way to cluster time series data of varying lengths? I tried something using dynamic time warping, but when I had a manageable number of clusters (10-20) the series within each cluster looked very dissimilar. I don't know if there is a standard way to cluster time series data, or whether there is some kind of guideline on when a cluster becomes too dissimilar. If this works, then manually tune a model for each cluster.

  2. Train a model (maybe a neural network or LSTM) on all the different time series at the same time, with the hope that this model would then be capable of producing 'good' predictions per time series fed to it.
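For concreteness, the dynamic time warping comparison mentioned in option 1 can be sketched in a few lines. This is only a minimal illustration, assuming each series is z-normalised first so that sales volume does not dominate the distance; the function names and the toy `demand` data are made up for the example, and the resulting distance matrix would still have to be passed to a clustering method of your choice.

```python
import math


def z_normalize(series):
    """Rescale to zero mean and unit variance so shape, not volume, drives the distance."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    if std == 0:
        return [0.0] * n  # constant series: no shape information
    return [(x - mean) / std for x in series]


def dtw_distance(a, b):
    """Plain O(len(a) * len(b)) dynamic time warping with absolute-difference cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def distance_matrix(series_by_product):
    """Pairwise DTW distances between z-normalised series, in sorted product order."""
    names = sorted(series_by_product)
    normed = {k: z_normalize(series_by_product[k]) for k in names}
    return names, [[dtw_distance(normed[p], normed[q]) for q in names] for p in names]


# Hypothetical toy data: A and B share a shape but differ in scale and length.
demand = {
    "A": [10, 12, 30, 12, 10, 11],
    "B": [100, 120, 300, 120, 100, 110, 105],
    "C": [5, 6, 7, 8, 9, 10],
}
names, dist = distance_matrix(demand)
```

Here the distance between A and B comes out much smaller than the distance between A and C, even though A and B have different lengths and scales.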

Is there some sort of methodology for training a model to make predictions on many (seemingly unrelated) time series? Most of the literature I have read concerns itself with producing a model for one time series rather than a more general model. I understand that forecasting assumes the model is able to "mimic" the function that generated the existing data, so it is very difficult to have a multipurpose model. But there must be some kind of resolution or generally accepted way of working with many different time series?

Best Answer

Yes, there are ways of doing this. You could apply some kind of meta learning to adapt the learning process to each separate time series, or use transfer learning to transfer the knowledge learned from one series to another. I don't have pointers, since this is certainly not the first thing I would do (see below).

You could also try calculating seasonal indices for groups of products and deseasonalizing all the series in a group together, then applying simpler non-seasonal models to the deseasonalized series. A simple paper on this is "The Application of Product-Group Seasonal Indexes to Individual Products" by Mohammadipour, Boylan & Syntetos, Foresight, 2012. A similar process should also work for other drivers, like trend, calendar events or promotions.
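As a rough sketch of the group-index idea (not necessarily the exact procedure in the paper), one could compute multiplicative indices by scaling each series by its own mean and averaging across the group at each position in the seasonal cycle. This assumes every series in the group has a positive mean, no strong trend, and starts at the same point in the cycle; the function names are illustrative.

```python
def group_seasonal_indices(group, period):
    """Multiplicative seasonal indices shared by all series in a group."""
    sums = [0.0] * period
    counts = [0] * period
    for series in group:
        mean = sum(series) / len(series)
        for t, x in enumerate(series):
            # Scale by the series mean so high-volume products do not dominate.
            sums[t % period] += x / mean
            counts[t % period] += 1
    raw = [sums[k] / counts[k] for k in range(period)]
    # Normalise so the indices average to 1 over a full cycle.
    mean_raw = sum(raw) / period
    return [r / mean_raw for r in raw]


def deseasonalize(series, indices):
    """Divide each observation by the index for its position in the cycle."""
    period = len(indices)
    return [x / indices[t % period] for t, x in enumerate(series)]


def reseasonalize(forecast, indices, start):
    """Multiply a non-seasonal forecast back by the indices; start is the time index of the first step."""
    period = len(indices)
    return [f * indices[(start + h) % period] for h, f in enumerate(forecast)]


# Hypothetical group: two products with the same quarterly pattern, different levels.
group = [
    [5, 10, 15, 10, 5, 10, 15, 10],
    [20, 40, 60, 40, 20, 40, 60, 40],
]
indices = group_seasonal_indices(group, period=4)
flat = deseasonalize(group[0], indices)               # roughly constant at 10
forecast = reseasonalize([10.0, 10.0], indices, start=8)
```

Any simple non-seasonal model can then be fitted to `flat`, and its forecasts passed through `reseasonalize` to put the shared seasonal pattern back.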


Alternatively, do consider fitting simple models to all your series, e.g., exponential smoothing. This will fit extremely quickly. Or invest a little time in some feature engineering and consider a very simple linear model - see Warmerdam's PyData presentation on the benefits of simple models; he even discusses time series models. If nothing else, the simpler models will serve as a useful benchmark. After you have invested one day in training the simple models and two weeks in meta and transfer learning for the more complex ones, you may very well find that the simple models outperform the more complex ones. (And that they are easier to interpret and communicate, and to maintain in production.)
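To give a sense of how little code such a benchmark needs, here is a minimal sketch of simple exponential smoothing applied to every series in a loop. The smoothing parameter is picked per series from a small, arbitrary grid by minimising one-step-ahead squared error, and the multi-step forecast is flat at the final level. The function names, the grid and the toy data are assumptions for illustration; in practice a library implementation would be the more robust choice.

```python
def ses_forecast(series, alpha, horizon):
    """Simple exponential smoothing; the multi-step forecast is flat at the last level."""
    level = series[0]
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
    return [level] * horizon


def one_step_sse(series, alpha):
    """Sum of squared one-step-ahead errors, used here to choose alpha."""
    level = series[0]
    sse = 0.0
    for x in series[1:]:
        sse += (x - level) ** 2
        level = alpha * x + (1 - alpha) * level
    return sse


def fit_all(series_by_product, horizon, alphas=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Return {product: (best_alpha, forecast)} for every series."""
    results = {}
    for name, series in series_by_product.items():
        best_alpha = min(alphas, key=lambda a: one_step_sse(series, a))
        results[name] = (best_alpha, ses_forecast(series, best_alpha, horizon))
    return results


# Hypothetical toy data: one noisy but stable series, one with a level shift.
demand = {
    "steady": [10, 11, 9, 10, 10, 11, 9, 10],
    "shifted": [10, 10, 10, 10, 20, 20, 20, 20],
}
results = fit_all(demand, horizon=3)
```

On this toy data the stable series ends up with a small alpha (heavy smoothing) and the shifted series with a large one (fast adaptation), which is the per-series tuning done automatically at negligible cost.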
