Solved – Why are exponential smoothing forecasts exactly the same for the following 5 days in hourly sampled 30 days data? R ets forecast package

exponential-smoothing, forecasting, r, time series

I would like to use exponential smoothing to forecast 5 days ahead, but the forecasts all look the same. I have read the documentation for ets() in the forecast package and tried different additive and multiplicative models, but could not fix the problem. My data consists of 30 days of hourly measurements, and I would like to forecast days 31 to 35.

Here is my code

library(forecast)

mydatatsfreq <- ts(mydata, frequency = 24)
fit <- ets(mydatatsfreq, model='ZZZ')
summary(fit)

Output of summary

ETS(A,Ad,A)

Call: ets(y = mydatatsfreq, model = "ZZZ")

Smoothing parameters:
  alpha = 0.9971
  beta  = 1e-04
  gamma = 2e-04
  phi   = 0.9788

Initial states:
  l = 6.5994
  b = -0.0745
  s = -8.5981 -8.3857 -8.2845 -8.4552 -8.5558 -8.6233
      -8.662 -6.5815 5.5694 15.1411 20.8226 22.4551
      23.014 20.7874 15.5312 7.1746 -3.5179 -8.8709
      -8.8073 -8.5763 -8.6457 -8.74 -8.6555 -8.5355

sigma: 1.7493

 AIC     AICc      BIC 
5593.623 5596.326 5730.958

Training set error measures:
                     ME     RMSE       MAE MPE MAPE      MASE       ACF1
Training set 0.00722588 1.749286 0.8136336 NaN  Inf 0.6419291 0.05326251

This is the plot of forecasts

[image: plot of the forecasts]

Results of auto.arima()

Series: mydatatsfreq 
ARIMA(2,0,2)(0,0,2)[24] with non-zero mean 

Coefficients:
         ar1      ar2      ma1      ma2    sma1    sma2    mean
      1.8022  -0.8810  -0.5069  -0.3599  0.4508  0.3917  0.1713
s.e.  0.0190   0.0186   0.0414   0.0397  0.0447  0.0336  0.0056

sigma^2 estimated as 0.002391:  log likelihood=1146.15
AIC=-2276.3   AICc=-2276.1   BIC=-2239.68

Best Answer

In my opinion, ETS is not very useful for data such as you describe, because of the POSSIBLE presence of anomalies, level shifts, time trends and changing error variability, and because of the need for deterministic seasonal pulses rather than seasonal ARIMA structure. There are often hourly effects, day-of-the-week effects and other latent variables to be exploited, and the hour effect often depends on the day of the week. Please post your data and I will try to help you using potentially more powerful procedures than the ones you are trying.

EDITED AFTER RECEIPT OF DATA:

You have 24 values per day for 30 days, where non-zero values arise for only 10 of the 24 hours. An ARIMA approach (your approach) is flawed because so many values are 0.0, which creates the impression of strong autocorrelation at short lags. This is why your forecasts are the SAME. You really have 10 NON-ZERO observations per day for 30 days (300 observations) and wish to predict, say, the next 5 days (50 values).
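To make the "300 observations, not 720" point concrete, here is a minimal base-R sketch. The data is simulated, since the poster's series is not shown here; the active window (hours 8 to 17) and the bell-shaped daily profile are assumptions for illustration only.

```r
# Hypothetical stand-in for the poster's data: 30 days x 24 hours,
# exactly zero outside a 10-hour active window (hours 8..17 are an assumption).
set.seed(1)
n_days <- 30
hour   <- rep(0:23, times = n_days)
active <- hour >= 8 & hour <= 17
shape  <- sin(pi * (hour - 7.5) / 10)   # illustrative daily profile
mydata <- ifelse(active, 20 * shape + rnorm(length(hour), sd = 1), 0)

# Keep only the active hours and re-index as 10 observations per day.
y_active <- ts(mydata[active], frequency = 10)

length(mydata)       # 720 hourly values
length(y_active)     # 300 values that actually carry information
frequency(y_active)  # 10 readings per day
```

Any model would then be fitted to y_active, and its 50 forecasts mapped back onto the active hours of days 31 to 35, with the remaining hours set to 0.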

Using AUTOBOX automatically (my tool of choice), a reasonable model is here: [image]. An unusual value was detected at the 26th day, hour 2 of 10, via Intervention Detection procedures. A significant reduction in the variance of the errors was detected using the Tsay procedure [image] at observation 61 (the first reading of the 7th day), so Weighted Least Squares was employed. A significant upward level (step) shift was detected at day 14, period 4 (the 134th point of 300). Finally, an AR(1) component was identified and used.

Here is the ACF of the model residuals, suggesting sufficiency: [image]. Here is the plot of the forecasts for the next 5 days (50 values): [image]. The actual, fit and forecast graph is here: [image], and here with forecast limits: [image].

In summary, 8 seasonal pulses (deterministic effects for hour of the day, modulo 10) were found to be significant and used.
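For readers without AUTOBOX, the idea of deterministic hour-of-day effects can be sketched in base R with lm(). This is not the AUTOBOX model: the data is simulated, the daily profile is made up, and lm() keeps all nine dummies instead of selecting the 8 significant ones.

```r
# Hypothetical series: 30 days x 10 active hours (simulated, not the poster's data).
set.seed(2)
n_days  <- 30
slot    <- factor(rep(1:10, times = n_days))        # hour-of-day slot, 1..10
pattern <- c(5, 12, 18, 22, 23, 22, 18, 12, 5, 1)   # made-up daily profile
y       <- pattern[as.integer(slot)] + rnorm(length(slot), sd = 1)

# A constant plus hourly dummies; lm() uses slot 1 as the baseline level.
fit <- lm(y ~ slot)

# Forecast the next 5 days (50 values).
new_slots <- data.frame(slot = factor(rep(1:10, times = 5), levels = levels(slot)))
fc <- predict(fit, newdata = new_slots)

# One row per forecast day: every row holds the same 10 fitted hourly values.
matrix(round(fc, 2), nrow = 5, byrow = TRUE)
```

With purely deterministic hourly effects, each forecast day repeats the same 10 values, which is why forecasts for days 31 to 35 can legitimately look identical.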

In terms of software, if you have a simple problem, simple tools will suffice. In my opinion, your problem required a fairly comprehensive approach, as simple tools failed to characterize the data. Simple tools rarely (never!) deal with complex data.

NEW EDIT:

The model is driven by fundamentally deterministic structure, i.e. 1) a constant, 2) 8 hourly dummies, 3) a level shift and 4) a pulse indicator, PLUS an AR(1) factor of 0.6, which results in an asymptotic forecast. The hourly dummy variables are dominant, resulting in an approximately constant expectation. The forecasts differ slightly at the third decimal position. [image]
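The asymptotic behaviour can be checked with a few lines of arithmetic. In a regression with AR(1) errors, the forecast at horizon h is the deterministic part plus phi^h times the last deviation from it. The sketch below uses the quoted phi = 0.6; the last deviation of 1 is a hypothetical value chosen only to show the decay.

```r
phi      <- 0.6   # AR(1) coefficient quoted in the answer
h        <- 1:50  # forecast horizons, in active-hour steps (5 days x 10)
last_dev <- 1     # hypothetical last deviation from the deterministic part

# Contribution of the AR(1) term to the forecast at each horizon.
ar_part <- last_dev * phi^h

# Weight left after 1, 5, 10 and 20 steps.
round(ar_part[c(1, 5, 10, 20)], 6)
```

After one day (10 steps) the weight is already about 0.006, so from the second forecast day onward the AR(1) term only moves the forecasts around the third decimal place, and the hourly dummies determine what you see.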
