In the Arima package, using a Box-Cox transformation give wrong results when later applied to the forecast method.

For example, consider this data:

```
library(forecast)
data<-c(2,3,2,3,2,3)
```

And for the sake of simplicity, consider an ARIMA(0,0,0) model. (The mean of this series is 2.5.)

The mean forecast made without a Box-Cox transformation is correct:

```
forecast(Arima(data,order=c(0,0,0)))$mean
[1] 2.5 2.5 2.5 2.5 2.5 2.5 2.5 2.5 2.5 2.5
```

However if we use a Box-Cox transformation, such as a log transformation with lambda=0, the "mean" forecast is wrong:

```
forecast(Arima(data,order=c(0,0,0), lambda=0))$mean
[1] 2.44949 2.44949 2.44949 2.44949 2.44949 2.44949 2.44949 2.44949 2.44949 2.44949
```

It seems that to produce the mean forecast of Y=exp(X), it does E(Y)=exp(E(X)).

Is there a way to correct this?

Is there a package with a correct implementation of forecasts with Box-Cox transformations?

## Best Answer

The approach adopted by the

`forecast`

package is to fit a model to the transformed data and to backtransform both the point forecasts and prediction intervals. This is not "wrong". It is a legitimate choice. This preserves the coverage of the prediction intervals, and the back-transformed point forecast can be considered themedianof the forecast densities (assuming the forecast densities on the transformed scale are symmetric). For many purposes, this is acceptable, but occasionally the mean forecast is required. For example, with hierarchical forecasting the forecasts need to be aggregated, and medians do not aggregate but means do.It is easy enough to derive the mean forecast using a Taylor series expansion. Suppose $f(x)$ represents the back-transformation function, $\mu$ is the mean on the transformed scale and $\sigma^2$ is the variance on the transformed scale. Then using the first three terms of a Taylor expansion around $\mu$, the mean on the original scale is given by $$f(\mu) + \frac{1}{2}\sigma^2f''(\mu).$$

For a Box-Cox transformation, $$f(x) = \left\{\begin{array}{ll} (\lambda x+1)^{1/\lambda} & \text{if $\lambda\ne0$;}\\ e^x & \text{if $\lambda=0$.}\end{array}\right. $$ So $$f''(x) = \left\{\begin{array}{ll} (1-\lambda)(\lambda x+1)^{1/\lambda-2} & \text{if $\lambda\ne0$;}\\ e^x & \text{if $\lambda=0$.}\end{array}\right. $$ and the backtransformed mean is given by $$\left\{\begin{array}{ll} (\lambda \mu+1)^{1/\lambda}\left[1 + \frac{\sigma^2(1-\lambda)}{2(\lambda \mu+1)^{2}}\right] & \text{if $\lambda\ne0$;}\\ e^\mu\left[1 + \frac{\sigma^2}{2}\right] & \text{if $\lambda=0$.}\end{array}\right. $$ Therefore, to adjust the back-transformed mean obtained by R, the following code can be used.

An extended version of this answer is given on my blog.