I am running into a problem with the data transformations in a time series model I am building.
I apply the following transformations, in this order, to my target variable: (1) Box-Cox, (2) trend differencing, and (3) 0-1 scaling. I then apply the inverse transformations, in the reverse order, to the resulting predictions.
I am finding that negative numbers frequently reach the inverse Box-Cox transformation, which causes my application to crash. I have tried adding a constant to my target variable before the transformations above and subtracting that constant from the predictions afterwards, but I still see the same problem.
Is there a way to handle this?
(I am aware of the Yeo–Johnson transformation – could that be my answer?)
import numpy as np
import pandas as pd
from scipy.special import inv_boxcox
from sklearn.preprocessing import MinMaxScaler


def target_untransform_fun(series, helper):
    # step 1: reverse min-max scaling
    if helper[3] is not None:
        scaler = helper[3]
        series = pd.Series(
            scaler.inverse_transform(pd.DataFrame(series)).ravel(),
            index=series.index,
        )

    # step 2: reverse stationarity transformation
    series_atlevel = series.copy()  # initial value for the series this function returns
    if helper[1] in ('None', 'Failed'):
        # already an at-level forecast: nothing to undo
        pass
    elif helper[1] == '1 Step Differencing':
        # cumulate first differences starting from the last observed level
        series_atlevel.iloc[0] = helper[2][1] + series.iloc[0]
        for i in range(1, len(series)):
            series_atlevel.iloc[i] = series_atlevel.iloc[i - 1] + series.iloc[i]
    elif helper[1] == '2 Step Differencing':
        # cumulate second differences starting from the last two observed levels
        series_atlevel.iloc[0] = 2 * helper[2][1] - helper[2][0] + series.iloc[0]
        series_atlevel.iloc[1] = 2 * series_atlevel.iloc[0] - helper[2][1] + series.iloc[1]
        for i in range(2, len(series)):
            series_atlevel.iloc[i] = (
                2 * series_atlevel.iloc[i - 1] - series_atlevel.iloc[i - 2] + series.iloc[i]
            )
    elif helper[1] == 'Log Transform':
        series_atlevel = np.exp(series).dropna()
    series = series_atlevel

    # step 3: reverse Box-Cox
    if helper[0][0] == 'yes':
        series = inv_boxcox(series, helper[0][1])
        series = series - helper[0][3]  # subtract the constant added before transforming
    return series
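In case it helps, here is a minimal Yeo-Johnson round trip I tried with scikit-learn's `PowerTransformer` (toy data; it accepts zero and negative inputs, which plain Box-Cox cannot):

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer

# Toy target containing zero and negative values.
y = np.array([-2.0, 0.0, 1.5, 4.0, 10.0]).reshape(-1, 1)

pt = PowerTransformer(method="yeo-johnson", standardize=False)
y_t = pt.fit_transform(y)            # forward transform (lambda fitted by MLE)
y_back = pt.inverse_transform(y_t)   # inverse of the fitted transform
```

Here `y_back` recovers `y` to numerical precision.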
Best Answer
As you can see from the illustration at https://stats.stackexchange.com/a/467525/919, the Box-Cox transformation can take negative values and is invertible, when properly defined.
The best definition is that when $x$ is a positive number and $\lambda$ is any number, the Box-Cox transformation (with parameter $\lambda$) is
$$x\to \operatorname{BC}(x;\lambda) = \int_1^x y^{\lambda-1}\mathrm{d}y = \left\{\begin{aligned}\frac{x^\lambda-1}{\lambda},& \quad \lambda \ne 0 \\ \log x,& \quad \lambda=0.\end{aligned}\right.$$
The rules of algebra permit us to compute its inverse as
$$\operatorname{BC}^{-1}(y;\lambda) = \left\{\begin{aligned}(\lambda y + 1)^{1/\lambda},&\quad \lambda\ne 0\\ \exp(y),&\quad \lambda=0.\end{aligned}\right.$$
The $1/\lambda$ power is understood to be the unique $\lambda^\text{th}$ root given by
$$(\lambda y+1)^{1/\lambda} = \exp\left(\frac{\log(\lambda y+1)}{\lambda}\right).$$
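A direct transcription of these two formulas (scalar helper functions, written here only for illustration):

```python
import numpy as np

def bc(x, lam):
    """Box-Cox transform of positive x, per the definition above."""
    if lam == 0:
        return np.log(x)
    return (x**lam - 1) / lam

def bc_inv(y, lam):
    """Inverse Box-Cox; requires lam*y + 1 > 0 when lam != 0."""
    if lam == 0:
        return np.exp(y)
    return np.exp(np.log(lam * y + 1) / lam)

# Round trip holds for any positive x and any lambda:
x = 3.7
for lam in (-1.0, 0.0, 0.5, 2.0):
    assert np.isclose(bc_inv(bc(x, lam), lam), x)
```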
Notice that since $x \gt 0,$ $x^\lambda-1 \gt -1,$ whence for positive $\lambda$ the Box-Cox value $y$ must exceed $-1/\lambda.$ (Look at the left ends of the red, black, and blue curves in the figure corresponding to $\lambda = 2, 1, 1/2.$) Consequently the expression $\lambda y+1$ in the formula for the inverse is guaranteed to be positive provided $y$ is a possible value of the Box-Cox transform of some positive number.
If you are using positive $\lambda$ and obtain $y$ values of $-1/\lambda$ or smaller in your analysis, and you believe those need to be back-transformed, then you are using an invalid statistical procedure. Typically, applications of the Box-Cox transform should not yield values anywhere near $-1/\lambda.$ As the illustration shows, that is where the transformation applies its most extreme local changes, indicating your procedure is highly sensitive to both the choice of parameter and the values of $x.$ Treat that as a helpful flag of an incorrect assumption behind your analysis and proceed from there.
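One way to surface that flag early, rather than crash, is to check the bound before inverting. SciPy's `inv_boxcox` follows the same formula and returns `nan` outside the valid domain (the guard function below is illustrative, not part of any library):

```python
import numpy as np
from scipy.special import inv_boxcox

lam = 2.0
# Values at or below the lower bound -1/lam are not images of any positive x;
# inv_boxcox returns nan for them instead of raising.
vals = inv_boxcox(np.array([0.5, -0.4, -1.0]), lam)

def check_invertible(y, lam):
    """Flag predictions outside the range of the Box-Cox transform (lam > 0)."""
    bad = np.asarray(y) <= -1.0 / lam
    if bad.any():
        raise ValueError(
            f"{bad.sum()} prediction(s) at or below -1/lambda = {-1.0 / lam}; "
            "fix the modelling pipeline rather than the inverse transform."
        )
```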