Time Series Forecasting Strategies for 2000 Products

clustering, forecasting, time series

First of all, I realise that my question is very broad and that it may be hard to answer this question because of it.

Do you have any advice on how to approach a 'problem' where you need to make forecasts/predictions for 2000+ different products? In other words, each product requires its own forecast/prediction. I have 2 years of historical data at the weekly level (i.e., demand per week per product).

I need to do this in a short time period: I have about a week to do this, hence I am looking for ways that I can quickly make relatively good prediction models. Creating a model for each product and inspecting its performance closely, one by one, would be too time-consuming.

I thought of segmenting the products based on the variance, so that I can employ simple models for products that have a low variance. While this is probably not ideal, it would be a quick way to narrow down the number of models I need to create.
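
A minimal sketch of that segmentation idea, under stated assumptions: it uses the coefficient of variation (standard deviation divided by mean) rather than raw variance so that high-volume and low-volume products are comparable, and the 0.5 threshold, the function names, and the demand figures are all made up for illustration.

```python
from statistics import mean, pstdev


def coefficient_of_variation(series):
    """Standard deviation divided by mean; infinite when mean demand is zero."""
    avg = mean(series)
    if avg == 0:
        return float("inf")
    return pstdev(series) / avg


def segment_products(demand_by_product, threshold=0.5):
    """Split product ids into 'stable' and 'volatile' groups by CV.

    The threshold is an arbitrary illustrative cut-off, not a recommended value.
    """
    segments = {"stable": [], "volatile": []}
    for product_id, series in demand_by_product.items():
        cv = coefficient_of_variation(series)
        key = "stable" if cv <= threshold else "volatile"
        segments[key].append(product_id)
    return segments


# Hypothetical weekly demand for two products.
demand = {
    "A": [100, 102, 98, 101, 99, 100],  # steady seller
    "B": [0, 40, 0, 0, 90, 5],          # intermittent, lumpy demand
}
segments = segment_products(demand)
print(segments)
```

Products in the stable group could then get a simple model, leaving more time to inspect the volatile ones.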

It would be greatly appreciated if you have any practical advice for me on approaching this problem.

Best Answer

A follow-up to @StephanKolassa's answer:

  • I concur with Stephan that ets() from the forecast package in R is probably your best and fastest choice. If ets() doesn't give good results, you might also want to try Facebook's Prophet package (auto.arima() is easy to use, but in my experience two years of weekly data is borderline too little for an ARIMA model). Personally, I have found Prophet easier to use when promotion and holiday event data are available; otherwise ets() might work better. Your real challenge is more of a coding one: how to efficiently iterate your forecasting algorithm over a large number of time series. You can check this response for more details on how to automate forecast generation.

  • In demand forecasting, some form of hierarchical forecasting is frequently performed, i.e., you have 2000 products and need a separate forecast for each one, but there are similarities between products that might help with the forecasting. You want to find some way of grouping the products together along a product hierarchy and then use hierarchical forecasting to improve accuracy. Since you are looking for forecasts at the individual product level, consider trying the top-down hierarchical approach.

  • Something a little more far-fetched, but I would like to call it out: Amazon and Uber use neural networks for this type of problem. Instead of fitting a separate model for each product/time series, they use one gigantic recurrent neural network to forecast all the time series in bulk. Note that they still end up with individual forecasts for each product (in Uber's case it is traffic/demand per city as opposed to products); they are just using one large model (an LSTM deep learning model) to do it all at once. The idea is similar in spirit to hierarchical forecasting in the sense that the neural network learns from the similarities between the histories of different products to come up with better forecasts. The Uber team has made some of their code available (through the M4 competition GitHub repositories); however, it is C++ code (not exactly the favorite language of the stats crowd). Amazon's approach is not open source, and you have to use their paid Amazon Forecast service to produce the forecasts.
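
To make the top-down idea concrete, here is a rough Python sketch, not a definitive implementation. It assumes every product in a group has a history of the same length, uses simple exponential smoothing as a stand-in for ets(), and splits the group forecast by each product's share of historical demand. The function names, the alpha value, and the data are all hypothetical.

```python
def ses_forecast(series, alpha=0.3):
    """One-step-ahead forecast from simple exponential smoothing.

    A stand-in for a proper model such as ets(); alpha is an arbitrary choice.
    """
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


def top_down_forecast(demand_by_product, alpha=0.3):
    """Forecast the group total, then allocate it by historical demand share.

    Assumes a non-empty group whose series all have the same length.
    """
    histories = list(demand_by_product.values())
    # Aggregate the product-level histories into one group-level series.
    total_series = [sum(week) for week in zip(*histories)]
    total_forecast = ses_forecast(total_series, alpha)
    grand_total = sum(total_series)
    if grand_total == 0:
        return {pid: 0.0 for pid in demand_by_product}
    # Each product receives its historical proportion of the group forecast.
    return {
        pid: total_forecast * sum(series) / grand_total
        for pid, series in demand_by_product.items()
    }


# Hypothetical group of three related products (weekly demand).
group = {
    "shirt_s": [10, 12, 11, 13],
    "shirt_m": [20, 24, 22, 26],
    "shirt_l": [10, 12, 11, 13],
}
forecasts = top_down_forecast(group)
print(forecasts)
```

The same loop-over-groups pattern is what makes the approach scale: one model per group instead of one per product, with the per-product numbers derived by allocation.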


With regard to your second comment: you need to differentiate between forecasting sales and forecasting demand. Demand is unconstrained: if an item suddenly becomes popular and your customers want 200 units, it doesn't matter that you have only 50 units on hand; your demand is still 200 units.

In practice, it is very difficult to observe demand directly, so we use sales as a proxy for demand. The problem is that sales do not account for situations where a customer wanted to purchase a product but it was unavailable. To address this, information about inventory levels and stockouts is used alongside the historical sales data: it is either included directly in the model or used to preprocess the time series before fitting a forecasting model.
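
As one illustration of the preprocessing route, the sketch below replaces sales in stockout weeks with the average of the in-stock weeks, never going below the sales actually observed. This is a deliberately crude imputation chosen for brevity, and the function name, flag format, and numbers are assumptions rather than anything from a specific system.

```python
def impute_stockouts(sales, stocked_out):
    """Approximate unconstrained demand from sales and stockout flags.

    Weeks flagged as stockouts get the mean of the in-stock weeks, or the
    observed sales if those are higher (observed sales are a lower bound).
    """
    in_stock = [s for s, out in zip(sales, stocked_out) if not out]
    if not in_stock:
        return list(sales)  # no clean weeks to estimate a baseline from
    baseline = sum(in_stock) / len(in_stock)
    return [max(s, baseline) if out else s for s, out in zip(sales, stocked_out)]


# Hypothetical weekly sales with two stockout weeks (flags are made up).
sales = [50, 48, 20, 52, 0, 50]
stocked_out = [False, False, True, False, True, False]
demand_estimate = impute_stockouts(sales, stocked_out)
print(demand_estimate)
```

The cleaned series, rather than the raw sales, would then be passed to the forecasting model.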

Typically, an unconstrained forecast is generated first by a forecast engine and then passed on to a planning system, which adds the constraints you mention (e.g., demand is 500 units but only 300 units are available) along with other constraints (safety stock, presentation stock, budgetary constraints, plans for promotions or introductions of new products, etc.). However, this falls under the general rubric of planning and inventory management, not forecasting per se.
