Solved – Holt-Winters Forecasting – Why do we use most recent estimate for all projections going forward

exponential-smoothing, forecasting, time series

I've been doing some research on using the Holt-Winters method for forecasting and understand all but one aspect.

Why do we use the most recent estimate for the base and trend components for all projections going forward? Does the data lose some integrity staying locked on only one week, and only using seasonality to account for change?

I've googled relentlessly but can't find the rationale.
I want to be able to explain this if questions arise, so any insight would be great!

This video at 10:00 is what my question is geared at…
https://youtu.be/qpiWJaeJPtA?t=10m

Excel Snapshot

Best Answer

I assume we're not dealing with the multiplicative form.

We use the most recent estimates for the level and trend because of the way the model is set up: in effect, this choice corresponds to an assumption that's part of the model.

It's easiest to see if you look at the model in error-correction form (see, for example, Sec 7.5 of Hyndman & Athanasopoulos, Forecasting: Principles and Practice):

\begin{align*} \ell_{t} &= \ell_{t-1} + b_{t-1}+\alpha e_{t}\\ b_{t} &= b_{t-1} + \alpha \beta^*e_{t}\\ s_{t} &= s_{t-m}+\gamma e_t, \end{align*}

where $e_t=y_{t} - (\ell_{t-1} + b_{t-1}+s_{t-m})=y_{t} - \hat{y}_{t|t-1}$ is the one-step forecast error.

The components are what I'll call the current level $\ell_t$ (where we'd expect the seasonally corrected series to be), the slope $b_t$ (the expected change in that level from one time period to the next) and the seasonal component $s_t$.

(It sounds like your source is working in terms of a baseline level rather than the current level, but you should be able to translate the intuition from this formulation back to that one.)
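As a minimal sketch of these recursions (not taken from the cited text), the following Python runs the three updates over a series and then projects forward from the final state. The function names, argument names and toy data are hypothetical, chosen only for illustration, and the initial states are simply supplied rather than estimated.

```python
def holt_winters_additive(y, m, alpha, beta_star, gamma,
                          level0, slope0, seasonals0):
    """Run the additive Holt-Winters updates in error-correction form.

    seasonals0 holds the m initial seasonal states s_{1-m}, ..., s_0.
    Returns the final level, the final slope and the last m seasonal states.
    """
    level, slope = level0, slope0
    seasonals = list(seasonals0)  # seasonals[-m] is s_{t-m} at each step
    for obs in y:
        s_prev = seasonals[-m]
        error = obs - (level + slope + s_prev)       # e_t
        new_level = level + slope + alpha * error    # l_t
        slope = slope + alpha * beta_star * error    # b_t
        level = new_level
        seasonals.append(s_prev + gamma * error)     # s_t
    return level, slope, seasonals[-m:]


def point_forecast(level, slope, last_seasonals, h):
    """h-step-ahead forecast from the most recent level, slope and seasonals."""
    m = len(last_seasonals)
    return level + h * slope + last_seasonals[(h - 1) % m]


# Toy data (an assumption for illustration): exact linear trend plus a
# period-4 seasonal pattern, so every one-step error is zero.
m = 4
pattern = [1.0, -1.0, 2.0, -2.0]
y = [10.0 + 0.5 * t + pattern[(t - 1) % m] for t in range(1, 9)]

level, slope, last_seasonals = holt_winters_additive(
    y, m, alpha=0.3, beta_star=0.1, gamma=0.2,
    level0=10.0, slope0=0.5, seasonals0=pattern)

forecasts = [point_forecast(level, slope, last_seasonals, h) for h in range(1, 7)]
print(forecasts)  # [15.5, 14.0, 17.5, 14.0, 17.5, 16.0]
```

Because the toy series follows the model exactly, the forecasts simply continue the line and repeat the seasonal pattern; with real data the errors are nonzero and each observation nudges all three states.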

Starting with the line for $b$, imagine we're sitting at time $t-1$, trying to forecast $t$. The model for the slope $b$ (the rate of change of the current level) tells us that the next $b$ is the current $b$ plus a "jump". The slope term moves as a random walk, which makes the trend more or less linear in the short term (at least if the innovation variance is small) but not in the long term; we might call that locally linear.

Note that the future forecast error term $e_t$ is unknown, but is $0$ on average (if the model is correct).

So if our estimates are unbiased, the expected value for $b_t$ is our estimate of $b_{t-1}$.

Similarly, if we're sitting at time $t-1$, our most recent estimate for the current level is $\ell_{t-1}$. The next one will be the previous one, plus the slope, plus a "jump" (a multiple of the same innovation that moved the trend).

If we remove the effect of the trend over time, to get the level in terms of a "baseline" level, by defining the baseline level as $d_t=\ell_t-\sum_{i=1}^{t} b_{i-1}$, then $d_t=d_{t-1}+\alpha e_t$, so the expected baseline level is again the previous one: $E(d_t)=d_{t-1}$.

The expected seasonal component moves in a similar way (but 'seasonally'): its value is the value from the previous season-cycle plus a "jump".

So it's really just down to the model assumption of Holt-Winters. The forecasts use the most recent estimates because the model actually assumes that the next value of each component is the relevant 'previous' one (plus the slope, in the case of the level) plus a zero-mean innovation.
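To spell out the step from that assumption to the forecast formula, here is a short derivation sketch in the notation above. It conditions on everything known at time $t$ and uses $E(e_{t+j})=0$ for every future $j\geq 1$, so each future "jump" drops out of the expectation:

\begin{align*} E(b_{t+h}) &= b_t\\ E(\ell_{t+h}) &= \ell_t + h\,b_t\\ E(s_{t+h}) &= s_{t+h-m(k+1)}\\ \hat{y}_{t+h|t} &= \ell_t + h\,b_t + s_{t+h-m(k+1)}, \end{align*}

where $k=\lfloor (h-1)/m\rfloor$, so the seasonal index always points at the most recent estimate available for that season. The last line is the usual additive Holt-Winters forecast equation: every projection is built from the latest level, slope and seasonal estimates because those are the expected values of the future components under the model.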

Does the data lose some integrity staying locked on only one week and only using seasonality to account for change?

i) The model has a component to account for the changes over time; those jumps I keep mentioning. They don't affect the expected value until they happen (the model says we don't know which direction those jumps will go, only something about their typical size), but going out into the future they do affect its uncertainty.

ii) As with any model, if the model is badly wrong, the forecasts can't be expected to be right -- but if this model is a reasonable description of the data (and for many series it performs pretty well), then those forecasts are "correct" for that model and so should also work well. No model is likely to be actually "right", but the components of this model are close to how we understand a number of processes to work.

So if, in the process generating your data, the direction of the changes in level and trend could be predicted (at least better than chance), it wouldn't make sense to use a model that assumes it can't.
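Point i) can be checked by simulation. The sketch below is illustrative only: it assumes Gaussian innovations, and the starting states, smoothing parameters and function name are made up. It simulates many futures from the same final state and compares their average and spread with the point forecast.

```python
import random
import statistics


def simulate_paths(level, slope, seasonals, alpha, beta_star, gamma,
                   sigma, horizon, n_paths, seed=0):
    """Simulate future paths of the additive model from a fixed final state.

    seasonals holds the last m seasonal states; innovations are drawn as
    N(0, sigma^2), which is an assumption made for this illustration.
    """
    rng = random.Random(seed)
    m = len(seasonals)
    paths = []
    for _ in range(n_paths):
        l, b, s = level, slope, list(seasonals)
        path = []
        for _ in range(horizon):
            e = rng.gauss(0.0, sigma)          # zero-mean "jump"
            s_prev = s[-m]
            path.append(l + b + s_prev + e)    # simulated observation
            # Right-hand sides use the old l and b, as in the model equations.
            l, b = l + b + alpha * e, b + alpha * beta_star * e
            s.append(s_prev + gamma * e)
        paths.append(path)
    return paths


# Hypothetical final state and smoothing parameters.
level, slope, seasonals = 14.0, 0.5, [1.0, -1.0, 2.0, -2.0]
horizon = 8
paths = simulate_paths(level, slope, seasonals, alpha=0.3, beta_star=0.1,
                       gamma=0.2, sigma=1.0, horizon=horizon, n_paths=5000)

point = [level + h * slope + seasonals[(h - 1) % len(seasonals)]
         for h in range(1, horizon + 1)]
means = [statistics.mean(p[h] for p in paths) for h in range(horizon)]
spreads = [statistics.stdev(p[h] for p in paths) for h in range(horizon)]

for h in range(horizon):
    print(h + 1, round(point[h], 2), round(means[h], 2), round(spreads[h], 2))
```

In this setup the simulated means stay close to the point forecasts at every horizon, while the standard deviations grow with the horizon: the jumps leave the expected path unchanged but widen the uncertainty around it.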

Related Solutions

Solved – Forecasting beyond one season using Holt-Winters’ exponential smoothing

I am not very familiar with Holt-Winters; however, I have this excellent book by @Rob Hyndman. The R package forecast (which is based on the book) gives the following result on your data:

> library(forecast)
> hw<-read.table("~/R/stackoverflow/hw.txt")
> tt<-ts(hw[,3],start=c(1999,1),frequency=12)

> aa<-forecast(tt)
> plot(aa)
> summary(aa)

Forecast method: ETS(M,N,A)

Model Information:
ETS(M,N,A) 

Call:
 ets(y = object) 

  Smoothing parameters:
    alpha = 0.1701 
    gamma = 1e-04 

  Initial states:
    l = 870.4847 
    s = -278.0815 -143.6584 151.959 -135.595 514.2527 236.9216
           -32.7679 128.8337 115.0829 47.5922 -234.4105 -370.1288

  sigma:  0.1122

     AIC     AICc      BIC 
1892.756 1896.346 1933.115 

In-sample error measures:
         ME        RMSE         MAE         MPE        MAPE        MASE 
 18.1543007 121.8594668  70.7086492   0.8480306   7.0006920   0.2893504

Here is the graph of the forecast together with the prediction intervals: [image]

Note that the function forecast automatically picks the best exponential smoothing model from 30 models, which are classified by the type of trend, the type of seasonal component, and whether the error is additive or multiplicative.

The best model found for your data has multiplicative error, no trend and additive seasonality, which is a less complicated model than the one you are trying to fit. However, the way the function forecast works, the more complicated model was considered and rejected in favor of the final model.

If you provide the exact formulas, it would be possible to fit the precise model to see whether the problem you described is really a property of the model.

Solved – Using Holt-Winters for forecasting in Python

I think the R forecast package you mentioned is a better fit for this problem than just using Holt-Winters. The two functions you are interested in are ets() and auto.arima(). ets() will fit an exponential smoothing model, including Holt-Winters and several other methods. It will choose parameters (alpha, beta, and gamma) for a variety of models and then return the one with the lowest AIC (or BIC if you prefer). auto.arima() works similarly.

However, as IrishStat pointed out, these kinds of models may not be appropriate for your analysis. In that case, try calculating some covariates, such as dummy variables for weekends, holidays, and their interactions. Once you've specified covariates that make sense, use auto.arima() to find an ARMAX model, and then forecast() to make predictions. You will probably end up with something much better than a simple Holt-Winters model in Python with default parameters.

You should also note that both ets() and auto.arima() can fit seasonal models, but you need to format your data as a seasonal time series. Let me know if you need any help with that.

You can read more about the forecast package here.
