Solved – Converting standardized betas back to original variables

centering, predictor, regression, standard error, standardization

I realise this is probably a very simple question but after searching I can't find the answer I am looking for.

I have a problem where I need to standardize the variables, run the (ridge) regression, and calculate the ridge estimates of the betas.

I then need to convert these back to the scale of the original variables.

But how do I do this?

I found a formula for the bivariate case:

$$
\beta^* = \hat\beta \frac{S_x}{S_y} \>.
$$

This was given in D. Gujarati, Basic Econometrics, page 175, formula (6.3.8).

Here $\beta^*$ is the estimator from the regression run on the standardized variables, $\hat\beta$ is the same estimator converted back to the original scale, $S_y$ is the sample standard deviation of the regressand, and $S_x$ is the sample standard deviation of the regressor.

Unfortunately the book doesn't cover the analogous result for multiple regression.

Also, I'm not sure I fully understand the bivariate case. Simple algebraic manipulation gives the formula for $\hat\beta$ on the original scale:

$$
\hat\beta=\beta^* \frac{S_y}{S_x}
$$

It seems odd to me that a coefficient calculated on variables that have already been deflated by $S_x$ has to be deflated by $S_x$ again to convert it back. (Also, why are the mean values not added back in?)

So, can someone please explain how to do this for the multivariate case, ideally with a derivation, so that I can understand the result?

Best Answer

For the regression model using the standardized variables, we assume the following form for the regression line

$$ \mathbb E[Y] =\beta_{0}+\sum_{j=1}^{k}\beta_{j}z_{j}, $$

where $z_{j}$ is the j-th (standardized) regressor, generated from $x_j$ by subtracting the sample mean $\bar x_j$ and dividing by the sample standard deviation $S_j$: $$ z_j = \frac{x_j - \bar{x}_j}{S_j} $$
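As a concrete illustration, a minimal NumPy sketch of this standardization step might look as follows (the predictor matrix `X` is made up for the example):

```python
# Minimal sketch: standardize each predictor column by its sample mean and
# sample standard deviation (the data here are made up for illustration).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))     # hypothetical predictors: n = 100, k = 3

x_bar = X.mean(axis=0)            # sample means, one per column
S = X.std(axis=0, ddof=1)         # sample standard deviations (n - 1 denominator)
Z = (X - x_bar) / S               # standardized regressors z_j = (x_j - xbar_j) / S_j
```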

Carrying out the regression with the standardized regressors, we obtain the fitted regression line:

$$ \hat Y = \hat \beta_0 +\sum_{j=1}^{k} \hat \beta_{j}z_{j} $$

We now wish to find the regression coefficients for the non-standardized predictors. We have

$$ \hat Y = \hat \beta_0 +\sum_{j=1}^{k} \hat \beta_{j}\left(\frac{x_j - \bar{x}_j}{S_j}\right) $$

Re-arranging, this expression can be written as

$$ \hat Y = \left( \hat \beta_0 - \sum_{j=1}^k \hat \beta_j \frac{\bar x_j}{S_j} \right) + \sum_{j=1}^k \left(\frac{\hat \beta_j}{S_j}\right) x_j $$

As we can see, the intercept for the regression using the non-transformed variables is given by $ \hat \beta_0 - \sum_{j=1}^k \hat \beta_j \frac{\bar x_j}{S_j}$. The regression coefficient of the $j$-th predictor is $\frac{\hat \beta_j}{S_j}$.
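If it helps, here is a minimal NumPy sketch of this back-transformation for a ridge fit (the data and the penalty $\lambda$ are made up for illustration; the sanity check at the end is what matters):

```python
# Minimal sketch: fit ridge on the standardized predictors, then map the
# coefficients back to the original x-scale as derived above.
import numpy as np

rng = np.random.default_rng(1)
n, k, lam = 200, 3, 5.0
X = rng.normal(loc=10.0, scale=[1.0, 4.0, 0.5], size=(n, k))   # made-up predictors
y = 2.0 + X @ np.array([1.5, -0.7, 3.0]) + rng.normal(size=n)  # made-up response

x_bar, S = X.mean(axis=0), X.std(axis=0, ddof=1)
Z = (X - x_bar) / S

# ridge fit on [1, Z]; the intercept is left unpenalized
A = np.column_stack([np.ones(n), Z])
P = np.eye(k + 1)
P[0, 0] = 0.0
beta_std = np.linalg.solve(A.T @ A + lam * P, A.T @ y)   # beta_0, beta_1..k on the z-scale

# convert back to the original x-scale
b = beta_std[1:] / S                                     # slope_j = beta_j / S_j
b0 = beta_std[0] - np.sum(beta_std[1:] * x_bar / S)      # adjusted intercept

# sanity check: both parameterizations give identical fitted values
assert np.allclose(A @ beta_std, b0 + X @ b)
```

Note that because the ridge penalty is not scale-invariant, these back-transformed coefficients are the ridge solution of the standardized problem expressed on the original scale; they will generally differ from what you would get by running ridge directly on the unstandardized data, which is usually precisely why one standardizes first.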

In the case presented above, I have assumed that only the predictors have been standardized. If one also standardizes the response variable, transforming the covariate coefficients back to the original scale is done using the formula from the reference you gave. We have:

$$ \frac{\mathbb E[Y] - \bar y}{S_y} =\beta_{0}+\sum_{j=1}^{k}\beta_{j}z_{j} $$

Carrying out the regression, we get the fitted regression equation

$$ \hat Y_{scaled} = \frac{\hat Y_{unscaled} - \bar y}{S_y} = \hat \beta_0 +\sum_{j=1}^{k} \hat \beta_{j}\left(\frac{x_j - \bar{x}_j}{S_j}\right), $$

where the fitted values are on the scale of the standardized response. To unscale them and recover the coefficient estimates for the untransformed model, we multiply the equation by $S_y$ and bring the sample mean of $y$ to the other side:

$$ \hat Y_{unscaled} = \hat \beta_0 S_y + \bar y +\sum_{j=1}^{k} \hat \beta_{j}\left(\frac{S_y}{S_j}\right) (x_j - \bar{x}_j). $$

The intercept corresponding to the model in which neither the response nor the predictors have been standardized is consequently given by $ \hat \beta_0 S_y + \bar y - \sum_{j=1}^k \hat \beta_j \frac{S_y}{S_j}\bar x_j$, while the covariate coefficients for the model of interest are obtained by multiplying each coefficient by $S_y / S_j$.
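Here is the same kind of sketch with the response standardized as well (again with made-up data), checking that unscaling reproduces the original-scale fit:

```python
# Minimal sketch: standardize both y and the x_j, fit on the z-scale,
# then undo both scalings as derived above.
import numpy as np

rng = np.random.default_rng(2)
n, k = 200, 3
X = rng.normal(loc=5.0, scale=[2.0, 0.3, 7.0], size=(n, k))     # made-up predictors
y = -1.0 + X @ np.array([0.8, -2.5, 0.1]) + rng.normal(size=n)  # made-up response

x_bar, S = X.mean(axis=0), X.std(axis=0, ddof=1)
y_bar, S_y = y.mean(), y.std(ddof=1)
Z = (X - x_bar) / S
y_std = (y - y_bar) / S_y

# ordinary least squares on the fully standardized problem
A = np.column_stack([np.ones(n), Z])
beta_std, *_ = np.linalg.lstsq(A, y_std, rcond=None)

# back to the original scales: slope_j = beta_j * S_y / S_j,
# intercept = beta_0 * S_y + ybar - sum_j beta_j * (S_y / S_j) * xbar_j
b = beta_std[1:] * S_y / S
b0 = beta_std[0] * S_y + y_bar - np.sum(beta_std[1:] * (S_y / S) * x_bar)

# sanity check: unscaling the standardized fit reproduces the original-scale fit
assert np.allclose((A @ beta_std) * S_y + y_bar, b0 + X @ b)
```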