Let's talk sales forecasts: integrating a time series model with subjective "predictions/leads" from the sales team

forecasting, time series

I've learned a lot about time series forecasting this past year, but one thing I still lack a formal system for is integrating future sales projections into an existing time series model.

I'm hoping to put this out as a general question that applies to any model, but I'll also reference my specific model to help pin down the details.

Say I'm using a triple exponential smoothing (Holt-Winters) model with a damped trend to forecast sales, so there's a level, a trend, and a seasonal component, as well as a damping factor.

An exploration of the data shows that the best-fit trend, supposedly, is one with a 0.97 damping factor and a beta of virtually 0. That means the trend (growth) is essentially not updated by recent data at all: it is in effect fixed, and its contribution gradually shrinks. Our business growth is essentially slowing over the years, but I'm not sure a beta of 0.0 is reasonable for a model. A sudden upswing would affect the level, but not the trend whatsoever.
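To make the damped-trend behaviour concrete, here is a minimal sketch. It assumes the usual additive damped-trend form, where the h-step-ahead trend contribution is the last trend estimate times (phi + phi^2 + ... + phi^h). The function name and the 12-month horizon are just illustrative.

```python
def damped_trend_multiplier(phi, h):
    """Sum phi + phi^2 + ... + phi^h: how many 'units' of the current
    trend estimate are added to the level for an h-step-ahead forecast
    (assumed additive damped-trend form)."""
    return sum(phi ** k for k in range(1, h + 1))


phi = 0.97  # damping factor from the question

# Trend units added 12 months ahead, with and without damping.
damped_12 = damped_trend_multiplier(phi, 12)
undamped_12 = damped_trend_multiplier(1.0, 12)

# As h grows, the damped sum approaches phi / (1 - phi), so the
# forecast flattens out instead of growing forever.
long_run_limit = phi / (1 - phi)

print(damped_12, undamped_12, long_run_limit)
```

With beta near 0 the trend estimate itself barely moves, so under this form the only thing changing the projected growth is the damping sum above.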

Anyway, back to the question at hand. For simplicity's sake, this "master sales forecast" is basically an aggregate forecast covering, say, 30 clients. Not too uncommon.

I've considered forecasting individual clients as well, but we only need to predict total sales and nothing else. Individual forecasts may help toward that end, but in reality we have 500+ clients making up 10% of our business, and ensuring accurate, sane exponential smoothing models even for our top 20 clients is, well, a bit dubious. I'm open to the idea, though.

Here's where my confusion comes in. The sales team has growth predictions for EXISTING clients as well as NEW clients. Say they predict that current client abc will have 30k in growth for June 2015, and that new client xyz will bring 50k in revenue for Aug 2015.

There are a few major problems with integrating this into a time series model.

  1. Of course, sales predictions might simply be wrong or biased. In my case, they are usually too high. I can correct for this by multiplying them by 0.9 or 0.7 or what have you.

  2. It's difficult to ascertain WHEN a new client will come on. They say June, but that might mean August, or September, or who knows? It might mean never. How do I account for this in the time series model? If they say a $30k/month client will come on in June 2015, with a 'maybe' probability, do I assign each month a probability that the client is on board by then, and multiply it by the new revenue and the bias factor? Say, 10% by June, 20% by July, 30% by August … 50% by December? The probability would increase over time because June requires a sign-on in June or earlier, whereas December has a larger success window: a sign-on in December or earlier.

  3. The third problem is integrating this "expected current client growth" and "expected new client growth" with the triple exponential smoothing model. Isn't this model ALREADY boosting the level and trend based on sales growth? Simply slapping the "expected growth" on top of the ETS model could "double-count" it.
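The weighting described in point 2 could be sketched roughly like this. It is only an illustration: the function name, the month keys, and the choice of 0.7 as the bias factor are assumptions, and the probabilities are the made-up ones from the list above, read as "chance the client has signed on by that month".

```python
def expected_addon(monthly_revenue, prob_signed_by_month, bias_factor=1.0):
    """Expected extra revenue per month from one pipeline client.

    prob_signed_by_month maps month -> cumulative probability that the
    client has signed on by that month, so values must not decrease.
    """
    probs = list(prob_signed_by_month.values())
    if any(later < earlier for earlier, later in zip(probs, probs[1:])):
        raise ValueError("cumulative sign-on probabilities must not decrease")
    return {
        month: monthly_revenue * p * bias_factor
        for month, p in prob_signed_by_month.items()
    }


# Hypothetical inputs: a $30k/month client, probabilities from point 2,
# and a 0.7 haircut for the sales team's optimism.
prob_signed_by = {"2015-06": 0.10, "2015-07": 0.20, "2015-08": 0.30}
addon = expected_addon(30_000, prob_signed_by, bias_factor=0.7)

for month, value in addon.items():
    print(month, round(value))
```

This only handles the timing and bias questions (points 1 and 2). It does nothing about the double-counting concern in point 3.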

Say I have my ETS forecast for September, plus this sales "add-on" value. Come July, we have a HUGE sales spike from the expected clients, and the ETS model adapts upward. Now the 'add-on' for September is, well, growth being counted twice.

See what I mean? Is the answer somehow in decomposing the model further?

I'm just wondering how this all is accomplished.

Surely my problem, predicting sales numbers, is not unique in the slightest.

Best Answer

Your idea of multiplying by 0.7 or 0.9 is in fact what happens in the regression-type model I describe below.

You can take the sales team's historical and future forecasts for the top 30 clients and use them as inputs to a regression model. Now, don't think you can just do it in Excel and have an answer. You need to consider some other things.
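As a bare-bones illustration of the regression idea, the sketch below regresses past actuals on the sales team's past forecasts (no intercept) and applies the slope to a new forecast. The numbers are invented so that the slope comes out at 0.7, and the function name is hypothetical. It deliberately ignores the outlier, level-shift, and ARIMA issues, so treat it as a toy, not a finished model.

```python
def shrink_factor(forecasts, actuals):
    """Least-squares slope of actuals on forecasts, with no intercept."""
    return sum(f * a for f, a in zip(forecasts, actuals)) / sum(
        f * f for f in forecasts
    )


# Hypothetical history: what sales predicted vs. what actually came in.
past_forecasts = [30_000, 50_000, 20_000, 40_000]
past_actuals = [21_000, 35_000, 14_000, 28_000]

slope = shrink_factor(past_forecasts, past_actuals)

# Apply the estimated factor to a new sales-team prediction.
new_sales_forecast = 50_000
adjusted = slope * new_sales_forecast

print(slope, adjusted)
```

The slope plays the role of the hand-picked 0.7 or 0.9 multiplier, except that it is estimated from the track record rather than guessed.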

The approach you are using now assumes a model form. The Box-Jenkins methodology instead tries to identify the model from the data. During this process there are warts, such as outliers and changes in level, trend, seasonality, parameters, or variance, that might need to be addressed. Sound complicated? Yes, it is, and just imagine trying to do that with a regression model. The thing is that when people move from a single time series to regression, they like to pretend those warts don't exist, as if they were Alice in Wonderland. Don't forget the possible need for an ARIMA component in this too. The ultimate goal is to have random errors, which shows that your model has captured the signal.

While you may have learned a lot about forecasting, it sounds like you are an SAP customer who is being forced to play with parameters.

If the estimates are always high, then the regression coefficient might be 0.7, meaning that the forecasts are high by a factor of 1/0.7 ≈ 1.43, and the model will adjust them downwards automatically. If the relationship has changed over time, then the check for changes in parameters mentioned above (the Chow test) will come in handy: it lets you drop the older data from the old regime and use only the new regime (i.e. a better estimate from the 30 clients).
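For a feel of what the Chow test computes, here is a small self-contained sketch using a simple line (intercept plus slope, so k = 2 parameters) and a known candidate break point. The data are invented: the first six periods have actuals at about 0.9 of forecast and the last six at about 0.7. The function names are hypothetical, and in practice the break point usually has to be searched for rather than assumed.

```python
def fit_rss(x, y):
    """Residual sum of squares of an OLS line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


def chow_f(x, y, split, k=2):
    """Chow F statistic for a break at index `split` (k parameters per fit)."""
    rss_pooled = fit_rss(x, y)
    rss_split = fit_rss(x[:split], y[:split]) + fit_rss(x[split:], y[split:])
    n = len(x)
    return ((rss_pooled - rss_split) / k) / (rss_split / (n - 2 * k))


# Hypothetical sales-team forecasts and small fixed "noise" terms.
forecasts = [100, 120, 90, 110, 130, 105, 95, 125, 115, 135, 100, 120]
noise = [2, -1, 1, -2, 1, -1, 2, -2, 1, -1, -2, 1]

# Old regime (first 6): actual ~ 0.9 * forecast; new regime: ~ 0.7 * forecast.
actuals_break = [
    (0.9 if i < 6 else 0.7) * f + e
    for i, (f, e) in enumerate(zip(forecasts, noise))
]
# Comparison series with no regime change at all.
actuals_stable = [0.7 * f + e for f, e in zip(forecasts, noise)]

f_break = chow_f(forecasts, actuals_break, split=6)
f_stable = chow_f(forecasts, actuals_stable, split=6)
print(f_break, f_stable)
```

A large F relative to the F(k, n - 2k) distribution suggests the coefficients differ between the two periods, in which case fitting only on the newer regime is the move described above.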
