Solved – How to Handle Many Time Series Simultaneously

arima, machine learning, time series

I have a data set containing the demand for 1,200 products over 25 periods, and I need to predict each product's demand for the next period. At first I wanted to use ARIMA and train one model per product, but given the number of products and the tuning of the (p, d, q) parameters, this is too time-consuming to be practical. Would it be reasonable to use a regression in which previous demands are the independent variables (an autoregressive model)?

Is there a method to train a single model that predicts the demand of all 1,200 products? I would be grateful for suggestions of Python libraries, since I am working in Python.

Best Answer

As Ben mentioned, the textbook methods for multiple time series are VAR and VARIMA models. In practice, though, I have not seen them used very often for demand forecasting.

Much more common, and what my team currently uses, is hierarchical forecasting (see here as well). Hierarchical forecasting is used whenever we have groups of similar time series: sales history for groups of similar or related products, tourist data for cities grouped by geographical region, and so on.

The idea is to have a hierarchical listing of your different products and then forecast both at the base level (i.e. for each individual time series) and at the aggregate levels defined by your product hierarchy (see attached graphic). You then reconcile the forecasts at the different levels (using Top-Down, Bottom-Up, Optimal Reconciliation, etc.) depending on the business objectives and the desired forecasting targets. Note that you won't be fitting one large multivariate model in this case, but multiple models at different nodes in your hierarchy, which are then reconciled using your chosen reconciliation method.
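To make the reconciliation step concrete, here is a minimal sketch of the two simplest schemes, Bottom-Up and Top-Down, in plain Python. The hierarchy, product names, and numbers are made up for illustration, and the top-down split uses historical shares, which is only one of several possible proportion rules. Optimal Reconciliation is not shown.

```python
# Hypothetical two-level hierarchy: category -> products (names are made up).
hierarchy = {
    "beverages": ["cola", "juice"],
    "snacks": ["chips", "nuts", "bars"],
}


def bottom_up(base_forecasts, hierarchy):
    """Bottom-Up: each aggregate forecast is the sum of its children's forecasts."""
    return {
        parent: sum(base_forecasts[child] for child in children)
        for parent, children in hierarchy.items()
    }


def top_down(aggregate_forecasts, history, hierarchy):
    """Top-Down: split each aggregate forecast by the children's historical shares."""
    reconciled = {}
    for parent, children in hierarchy.items():
        totals = {child: sum(history[child]) for child in children}
        parent_total = sum(totals.values())
        for child in children:
            # Fall back to an equal split if the group has no historical demand.
            if parent_total:
                share = totals[child] / parent_total
            else:
                share = 1 / len(children)
            reconciled[child] = aggregate_forecasts[parent] * share
    return reconciled


# Illustrative inputs: past demand per product and independently made forecasts.
history = {
    "cola": [10, 12, 14], "juice": [5, 6, 7],
    "chips": [20, 20, 20], "nuts": [10, 10, 10], "bars": [10, 10, 10],
}
base_forecasts = {"cola": 15.0, "juice": 8.0, "chips": 21.0, "nuts": 9.0, "bars": 10.0}
aggregate_forecasts = {"beverages": 27.0, "snacks": 40.0}

print(bottom_up(base_forecasts, hierarchy))
print(top_down(aggregate_forecasts, history, hierarchy))
```

Bottom-Up trusts the product-level models and derives the category numbers from them. Top-Down trusts the category-level models and pushes their totals down. Which one is preferable depends on which level your business targets are defined at.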

[Figure: example product hierarchy (image not available)]

The advantage of this approach is that by grouping similar time series together, you can take advantage of the correlations and similarities between them to find patterns (such as seasonal variations) that might be difficult to spot in a single time series. Since you will be generating far more forecasts than you could tune manually, you will need to automate your time series forecasting procedure, but that is not too difficult; see here for details.
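As a rough idea of what such automation can look like, the sketch below picks a forecasting method for each series by its one-step-ahead error on the last few observations. The candidate methods (naive, mean, simple exponential smoothing with two fixed smoothing factors) and the three-point holdout are assumptions chosen to keep the example short; a real setup would more likely search over ARIMA or ETS specifications.

```python
def naive(train):
    """Forecast the last observed value."""
    return train[-1]


def mean_forecast(train):
    """Forecast the historical mean."""
    return sum(train) / len(train)


def make_ses(alpha):
    """Build a simple exponential smoothing forecaster with a fixed alpha."""
    def ses(train):
        level = train[0]
        for y in train[1:]:
            level = alpha * y + (1 - alpha) * level
        return level
    return ses


# Candidate methods to choose from (an illustrative, deliberately small set).
CANDIDATES = {
    "naive": naive,
    "mean": mean_forecast,
    "ses_0.3": make_ses(0.3),
    "ses_0.7": make_ses(0.7),
}


def holdout_error(series, method, n_holdout=3):
    """Mean absolute one-step-ahead error over the last n_holdout points."""
    errors = [
        abs(series[t] - method(series[:t]))
        for t in range(len(series) - n_holdout, len(series))
    ]
    return sum(errors) / len(errors)


def auto_forecast(series, n_holdout=3):
    """Pick the candidate with the lowest holdout error, then forecast with it."""
    best = min(
        CANDIDATES,
        key=lambda name: holdout_error(series, CANDIDATES[name], n_holdout),
    )
    return best, CANDIDATES[best](series)


def forecast_all(demand):
    """Run the automatic selection for every product in a {name: series} dict."""
    return {product: auto_forecast(series) for product, series in demand.items()}


# Two made-up products with 25 periods each.
demand = {
    "trending": list(range(1, 26)),
    "alternating": [10 if t % 2 == 0 else 20 for t in range(25)],
}
print(forecast_all(demand))
```

The same loop runs unchanged over 1,200 series, and over the aggregate series at each node of the hierarchy.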

A more advanced approach, similar in spirit, is used by Amazon and Uber, where one large RNN/LSTM neural network is trained on all of the time series at once. It is similar in spirit to hierarchical forecasting because it also tries to learn patterns from similarities and correlations between related time series. It differs from hierarchical forecasting because it tries to learn the relationships between the time series itself, as opposed to having these relationships predetermined and fixed prior to doing the forecasting. In this case, you no longer have to deal with automated forecast generation, since you are tuning only one model. However, because the model is a very complex one, tuning is no longer a simple AIC/BIC minimization task, and you need to look at more advanced hyper-parameter tuning procedures, such as Bayesian Optimization.
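The "one model for all series" idea can be illustrated without a neural network. The sketch below is not an LSTM: it swaps in a plain linear regression on lagged demand, pooled across all products, which is also the autoregressive regression the question asks about. The number of lags, the scaling of each series by its own mean, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def series_scales(demand):
    """Per-series mean used for scaling; series with zero mean are left unscaled."""
    scales = demand.mean(axis=1, keepdims=True)
    scales[scales == 0] = 1.0
    return scales


def make_windows(demand, n_lags):
    """Stack (previous n_lags values -> next value) pairs from every series."""
    scaled = demand / series_scales(demand)
    X, y = [], []
    for series in scaled:
        for t in range(n_lags, len(series)):
            X.append(series[t - n_lags:t])
            y.append(series[t])
    return np.array(X), np.array(y)


def fit_global_model(demand, n_lags=4):
    """Fit one least-squares autoregression shared by all series."""
    X, y = make_windows(demand, n_lags)
    X = np.column_stack([np.ones(len(X)), X])  # intercept column
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def predict_next(demand, coef, n_lags=4):
    """Predict the next period for every series with the shared coefficients."""
    scales = series_scales(demand)
    last = demand[:, -n_lags:] / scales
    X = np.column_stack([np.ones(len(last)), last])
    return (X @ coef) * scales[:, 0]


# Synthetic stand-in for the real data: 1,200 products, 25 periods.
rng = np.random.default_rng(0)
demand = rng.gamma(shape=5.0, scale=20.0, size=(1200, 25))

coef = fit_global_model(demand)
forecast = predict_next(demand, coef)
print(forecast.shape)
```

With 1,200 series of 25 points and 4 lags, the pooled model is fitted on 25,200 training windows instead of 21 per product, which is where the gain over per-product models comes from. An LSTM version follows the same windowing pattern but replaces the least-squares fit with a recurrent network.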

See this response (and comments) for additional details.

For Python packages, PyAF is available but not very popular. Most people use the hts package in R, for which there is a lot more community support. For LSTM-based approaches, there are Amazon's DeepAR and MQRNN models, which are part of a service you have to pay for. Several people have also implemented LSTMs for demand forecasting using Keras; you can look those up.
