I am trying to assess the influence of a product change compared to a hypothetical forecast. I use a time series model to predict a realistic future and compare it to the observed actuals. I am interested in the mean absolute and percentual difference between the prediction and the actuals.
How may I correctly estimate confidence intervals in this scenario and account for the specific time series structure of the data?
library(forecast)
library(ggplot2)
train <- subset(AirPassengers, end = 84)
actuals <- subset(AirPassengers, start = 85)
m1 <- ets(train, "AAN")
fc <- forecast(m1, h = 12 * 5)
fc %>%
autoplot() +
autolayer(actuals)
# Current Approach: T-Test for Confidence Intervals
mean(actuals - fc$mean)
#> [1] 60.60125
t.test(actuals, fc$mean) # possibly use paired = TRUE
#>
#> Welch Two Sample t-test
#>
#> data: actuals and fc$mean
#> t = 5.3794, df = 78.309, p-value = 7.533e-07
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#> 38.17509 83.02742
#> sample estimates:
#> mean of x mean of y
#> 396.4333 335.8321
uplift_pct <- (actuals - fc$mean) / fc$mean
mean(uplift_pct)
#> [1] 0.1767661
t.test(uplift_pct)
#>
#> One Sample t-test
#>
#> data: uplift_pct
#> t = 7.4578, df = 59, p-value = 4.535e-10
#> alternative hypothesis: true mean is not equal to 0
#> 95 percent confidence interval:
#> 0.1293381 0.2241941
#> sample estimates:
#> mean of x
#> 0.1767661
Best Answer
This is easily done using the general structure you've provided. Unfortunately, you left out the annual seasonal term, and consequently the algorithm was unable to estimate the seasonal component that contains most of the time series structure.
Note also that for this data the series seems to increase in a nonlinear fashion, as does the variability due to seasonal swings. This indicates that a Box-Cox transform may help.
The following code does what you want, with a little extra added for the Box-Cox transform. The confidence intervals, as plotted, do not contain the additional variability due to the estimation of the Box-Cox parameter $\lambda$, which, as it happens, equals $0$ - i.e., a log transform performed best according to a minimum AIC criterion.
which leads to the following graph:
Clearly this is not a very good forecast at the longer horizons, but in real life we'd probably forecast over a shorter horizon to start with and use out-of-sample predictions to select $\lambda$. In any case, the objective is to show you how to capture the annual structure and the confidence intervals of the forecasts in the model.
Now, on to something more like the question you really asked! The forecast object
fc2
contains, as a default, 80% prediction intervals and 95% prediction intervals in the first and second columns respectively offc2$lower
andfc2$upper
. We can extract these intervals from the forecast object as follows:These are pointwise prediction intervals, in the sense that for each period taken individually there is an $\alpha$% probability of falling outside the interval (given all the assumptions.) However, forecast errors are autocorrelated, so the probability of falling outside the interval in, say, Feb. 1956 is not independent of whether or not we fell outside the interval in Jan. 1956; consequently, a t-test based on the errors over the whole forecast horizon would be biased due to the failure of independence.
Since forecast errors tend to be positively correlated for forecast models that take into account short-term dependencies in the variate of interest, we can apply Hochberg's step-up procedure (see here) for a joint test.
with result:
which in this instance would not reject any time period as having fallen outside the Hochberg-corrected 95% test interval, as the minimum observed p-value is $>$ the Hochberg cutoff level.
There are plenty of other multiple comparison corrections, see for example the Wikipedia page, that may be more or less useful to you.