In a regression equation, the coefficient for the intercept is the estimated value of the outcome when all of the predictors are equal to zero.
In a case like this where continuous predictors have been standardized but binary predictors have been coded as 0 and 1, the intercept represents the estimated value of the outcome for the mean value of the continuous predictor(s) and whatever level of the binary predictor(s) is coded as 0.
To make this a little more concrete, imagine I have two predictors: gender (male: 0, female: 1) and age (ranging from 20 to 40, with a mean of 30). If I standardise age and fit the regression model, I get the following coefficients:
| predictor | beta |
|-----------|------|
| Intercept | 5    |
| Gender    | 10   |
| Age       | 1    |
then the intercept tells us that for a male of average age (i.e. 30), the estimated outcome would be 5 ($5 + 0\times10 + 0\times1$). For a female of average age, the estimated outcome would be 15 ($5 + 1\times10 + 0\times1$).
As the number of predictors increases, interpretation may become a little more difficult, but the basic point remains: the intercept is the estimated value of $y$ when all the $x$s are zero.
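A small sketch of the example above, in Python for illustration. The coefficients and the age mean of 30 come from the example; the age SD of 5 is an assumption, and the `predict` helper is hypothetical.

```python
# Hypothetical coefficients taken from the example above
intercept, b_gender, b_age = 5.0, 10.0, 1.0

def predict(gender, age, age_mean=30.0, age_sd=5.0):
    """Estimate the outcome: age is standardized before use, gender is 0/1.
    Only the mean of 30 comes from the example; the SD of 5 is assumed."""
    age_z = (age - age_mean) / age_sd
    return intercept + b_gender * gender + b_age * age_z

print(predict(gender=0, age=30))  # male of average age -> 5.0
print(predict(gender=1, age=30))  # female of average age -> 15.0
```

At the average age the standardized age term is zero, so the prediction reduces to the intercept plus the gender effect, exactly as described above.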
What the scale function does in R is answered here. Basically, it re-scales a variable so that its mean is zero and its standard deviation is 1. Several points are worth noting:
1) If the original variables were not normally distributed (ND), the scaled variables will not be ND either. Conversely, if the original variables are ND, the rescaled variables will be ND as well.
2) A regression using scaled values will obviously have a different intercept than the unscaled originals if the original mean values were not zero.
3) If the original variables are distributed symmetrically about their means (and if the mean is a good measure of location), then the intercept of the regression on the scaled, zero-centered variables should be zero (even in the product model), but only when everything ($y$ and the $x$'s) is rescaled.
4) What does the scaling mean? Well, by itself, not much. One has to know the original means and standard deviations in order to interpret the scaled results. Basically, it adds nothing, and may even complicate matters by introducing variability in the independent variables (think of multiple time series, each scaled differently on the $x$-axis).
5) Finally, compute the correlation between $y$ and the model's fitted values for both the unscaled and the scaled regressions and compare the correlation coefficients. They will show no difference if the regression problem is unchanged.
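Point 5 can be checked numerically. The sketch below (Python/numpy rather than R; the data and the `scale` helper mimicking R's scale() are made up for illustration) fits OLS on raw and on fully scaled data and compares the $y$-versus-fitted correlations:

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(50, 10, 200)
x2 = rng.normal(5, 2, 200)
y = 3 + 0.5 * x1 - 2.0 * x2 + rng.normal(0, 1, 200)

def scale(v):
    # Equivalent of R's scale(): subtract the mean, divide by the SD
    return (v - v.mean()) / v.std(ddof=1)

def fitted(X, y):
    # Ordinary least squares with an intercept column; returns fitted values
    X = np.column_stack([np.ones(len(y)), *X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return X @ beta

fit_raw = fitted([x1, x2], y)
fit_scaled = fitted([scale(x1), scale(x2)], scale(y))

# Correlation between the outcome and the fitted values is the same either way
r_raw = np.corrcoef(y, fit_raw)[0, 1]
r_scaled = np.corrcoef(scale(y), fit_scaled)[0, 1]
print(np.isclose(r_raw, r_scaled))  # True
```

Because standardization is just a linear transformation, the fitted values move with it and the correlation is unchanged.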
That is, what you are doing is a linear transformation. For example,
$\frac{y-\bar{y}}{sd_y}=m_{s,x_1}\frac{x_1-\bar{x_1}}{sd_{x_1}}+m_{s,x_2}\frac{x_2-\bar{x_2}}{sd_{x_2}}+b_{s}$
So, just multiply both sides by $sd_y$ and collect the constant terms to recover the original coefficients: the unscaled slopes are $m_{x_1}=m_{s,x_1}\frac{sd_y}{sd_{x_1}}$ and $m_{x_2}=m_{s,x_2}\frac{sd_y}{sd_{x_2}}$, and the original intercept is $b=\bar{y}+b_{s}\,sd_y-m_{x_1}\bar{x_1}-m_{x_2}\bar{x_2}$.
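This back-transformation can be verified directly. A sketch in Python/numpy (simulated data; the `z` and `ols` helpers are hypothetical) fits the fully standardized regression, back-transforms the coefficients with the formulas above, and checks that they match an OLS fit on the raw data:

```python
import numpy as np

rng = np.random.default_rng(1)
x1 = rng.normal(10, 3, 500)
x2 = rng.normal(-2, 0.5, 500)
y = 4 + 1.5 * x1 + 2.0 * x2 + rng.normal(0, 1, 500)

def z(v):
    # Standardize: mean zero, SD one
    return (v - v.mean()) / v.std(ddof=1)

def ols(X, y):
    # OLS with intercept; returns [intercept, slope1, slope2, ...]
    X = np.column_stack([np.ones(len(y)), *X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Fit on fully standardized variables
b_s, m_s1, m_s2 = ols([z(x1), z(x2)], z(y))

# Back-transform: m_i = m_{s,i} * sd_y / sd_{x_i}; intercept from the means
sdy, sd1, sd2 = y.std(ddof=1), x1.std(ddof=1), x2.std(ddof=1)
m1 = m_s1 * sdy / sd1
m2 = m_s2 * sdy / sd2
b = y.mean() + b_s * sdy - m1 * x1.mean() - m2 * x2.mean()

# Compare with the regression on the original, unscaled variables
b0, mm1, mm2 = ols([x1, x2], y)
print(np.allclose([b, m1, m2], [b0, mm1, mm2]))  # True
```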
When the model involves a product of independent variables rather than a sum, things get messier still, as there are then $x_1x_2$, $x_1$, and $x_2$ terms. So it depends on what your original equation was (which you did not provide). If the original equation does not have separate $x_1x_2$, $x_1$, and $x_2$ terms, then the transformed equation and the original equation are two different regression problems and will not have the same $r$-values.
Best Answer
Firstly, why did you normalize Y? It will make your output harder to interpret, and it is often not necessary to standardize the dependent variable.
I presume you have centered and scaled your X's; you can back-transform them for interpretation.
I also recommend using Y on its original scale, since standardization does not change the shape of its distribution.
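A small worked example of that back-transformation for a single coefficient (all numbers here are made up for illustration):

```python
# Suppose a standardized slope of 0.5, with sd_y = 12 and sd_x = 4.
# One SD of x changes y by 0.5 SDs of y, i.e. 0.5 * 12 = 6 units of y,
# which is 6 / 4 = 1.5 units of y per original unit of x.
beta_std, sd_y, sd_x = 0.5, 12.0, 4.0
per_unit = beta_std * sd_y / sd_x
print(per_unit)  # 1.5
```

This is exactly the $m = m_{s}\,sd_y/sd_x$ conversion from the previous answer, applied one coefficient at a time.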