Regression – How to Interpret Scaled Regression Coefficients When Only Predictors Are Scaled

Tags: interpretation, regression, regression coefficients, standardization

I'm running a model with 2 continuous predictors (x1, x2) and 1 continuous outcome variable (y). Both slopes and the intercept are significant, with no significant interaction effect. Let's say my results look something like this:

(intercept):  216.00
x1:           -12.00
x2:            -8.00

Now, for the sake of interpretability, I've decided to standardize the variables. So I used the scale() function, and my model now has this form:

model.s <- lm(scale(y) ~ scale(x1) * scale(x2))

with these results:

(intercept):  -0.0123   # It's not significant anymore
x1:           -2.3  
x2:           -1.2   
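A minimal sketch (using hypothetical simulated data, so the numbers will not match the output above) shows why the intercept collapses once y itself is scaled: scale() centers y, so the fitted intercept is forced toward zero.

```r
# Hypothetical simulated data; the coefficients differ from the
# question's, but the structural point is the same.
set.seed(1)
x1 <- rnorm(100, mean = 50, sd = 5)
x2 <- rnorm(100, mean = 30, sd = 4)
y  <- 216 - 12 * x1 - 8 * x2 + rnorm(100, sd = 10)

model   <- lm(y ~ x1 * x2)                       # raw scale
model.s <- lm(scale(y) ~ scale(x1) * scale(x2))  # everything scaled

coef(model)[["(Intercept)"]]    # large, on y's original scale
coef(model.s)[["(Intercept)"]]  # near zero: scale() centered y
```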

My questions are:

  1. Why did the intercept lose its significance, and is this normal?
  2. I have scaled all 3 variables; is there anything wrong with that?
  3. How can I interpret the intercept in the scaled model?

Regarding the last one, my interpretation is that:

  1. when x1 is at mean(x1) and x2 is at mean(x2), y is 0.0123 SDs below its mean.
  2. when x1 goes up by 1 SD and x2 is at mean(x2), y decreases by 2.3 SDs
  3. when x2 goes up by 1 SD and x1 is at mean(x1), y decreases by 1.2 SDs

With the predictors standardized, but not the outcome variable:

model.s1 <- lm(y ~ scale(x1) * scale(x2))

The results are somewhat different: the intercept regains its significance and the values change:

(intercept):  98 
x1:          -20   
x2:          -17    

My interpretation of these results is:

  1. when x1 is at mean(x1) and x2 is at mean(x2), y is 98
  2. when x1 goes up by 1 SD and x2 is at mean(x2), y decreases by 20 units
  3. when x2 goes up by 1 SD and x1 is at mean(x1), y decreases by 17 units
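This second setup can also be checked with a short sketch (again on hypothetical simulated data): with only the predictors centered and scaled, the intercept is essentially the sample mean of y, which is why it is typically significant whenever mean(y) is far from zero.

```r
# Hypothetical data; the point is that with centered predictors the
# intercept estimates E[y] at x1 = mean(x1), x2 = mean(x2).
set.seed(1)
x1 <- rnorm(100, mean = 50, sd = 5)
x2 <- rnorm(100, mean = 30, sd = 4)
y  <- 216 - 12 * x1 - 8 * x2 + rnorm(100, sd = 10)

# only the predictors are scaled; y stays in its original units
model.s1 <- lm(y ~ scale(x1) * scale(x2))

coef(model.s1)[["(Intercept)"]]  # approximately mean(y)
mean(y)
```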

In other words, I interpret x1 and x2 in SD terms, while I interpret y in its original units. Is this interpretation wrong?

Best Answer

What the scale() function does in R is answered here. Basically, it re-scales a variable so that its mean is zero and its standard deviation is 1. Several points are worth noting:

1) If the original variables are not normally distributed (ND), the scaled variables will not be ND either; conversely, if the original variables are ND, the rescaled variables will remain ND.

2) A regression using scaled values will obviously have a different intercept than the unscaled originals if the original mean values were not zero.

3) If the original variables are distributed symmetrically about their means (and if the mean value is a good measure of location), then the intercept of the regression on the scaled, zero-centered variables should be zero (even with the product term), but only when everything ($y$ and the $x$'s) is rescaled.

4) What does the scaling mean? Well, by itself, not much. One has to know what the means and standard deviations were to begin with in order to interpret the scaled results. Basically, it adds nothing, and may even complicate matters by introducing variability (think of multiple different time-series scalings on the $x$-axis) of independent variables.

5) Finally, do $y$ versus $model$ correlations for both the unscaled and scaled regressions' models and compare correlation coefficients. That will show no difference if the regression problem is unchanged.
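Point 5 can be checked directly. A sketch with hypothetical simulated data (the interaction is kept in both models, so both span the same column space and the correlations agree exactly):

```r
set.seed(1)
x1 <- rnorm(100, mean = 50, sd = 5)
x2 <- rnorm(100, mean = 30, sd = 4)
y  <- 216 - 12 * x1 - 8 * x2 + rnorm(100, sd = 10)

model   <- lm(y ~ x1 * x2)
model.s <- lm(scale(y) ~ scale(x1) * scale(x2))

# correlation of observed vs fitted values for both problems
r_raw    <- cor(y, fitted(model))
r_scaled <- cor(as.numeric(scale(y)), fitted(model.s))
c(r_raw, r_scaled)  # identical: the regression problem is unchanged
```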

That is, what you are doing is a linear transformation. For example,

$\frac{y-\bar{y}}{sd_y}=m_{s,x_1}\frac{x_1-\bar{x}_1}{sd_{x_1}}+m_{s,x_2}\frac{x_2-\bar{x}_2}{sd_{x_2}}+b_{s}$

So, just multiply both sides by $sd_y$, expand, and collect the constant terms: the initial slopes are $m_{x_1}=\frac{sd_y}{sd_{x_1}}m_{s,x_1}$ and $m_{x_2}=\frac{sd_y}{sd_{x_2}}m_{s,x_2}$, and the initial intercept is recovered as $b=\bar{y}-m_{x_1}\bar{x}_1-m_{x_2}\bar{x}_2+sd_y\,b_{s}$.
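In R, the back-transformation for the additive (no-interaction) case amounts to multiplying a standardized slope by $sd_y/sd_{x_i}$. A sketch with hypothetical data:

```r
set.seed(1)
x1 <- rnorm(100, mean = 50, sd = 5)
x2 <- rnorm(100, mean = 30, sd = 4)
y  <- 216 - 12 * x1 - 8 * x2 + rnorm(100, sd = 10)

fit   <- lm(y ~ x1 + x2)                       # additive, unscaled
fit.s <- lm(scale(y) ~ scale(x1) + scale(x2))  # additive, scaled

# undo the standardization of the x1 slope
b1_recovered <- coef(fit.s)[["scale(x1)"]] * sd(y) / sd(x1)
c(coef(fit)[["x1"]], b1_recovered)  # equal up to floating point
```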

When this is done with a product of independent variables rather than just a sum, things get messier still, as there are then $x_1x_2$, $x_1$, and $x_2$ terms. So it depends what your original equation was (which you did not provide). If the original equation does not have separate $x_1x_2$, $x_1$, and $x_2$ terms, then the transformed equation and the original equation are two different regression problems and will not have the same $r$-values.
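This can be illustrated with hypothetical data: when the full interaction model is fitted, the scaled product $z_1 z_2$ expands into $x_1x_2$, $x_1$, $x_2$, and constant terms, so the scaled and unscaled model matrices span the same space and the fits agree exactly; a product-only model, by contrast, is a genuinely different regression problem.

```r
set.seed(1)
x1 <- rnorm(100, mean = 50, sd = 5)
x2 <- rnorm(100, mean = 30, sd = 4)
y  <- 216 - 12 * x1 - 8 * x2 + rnorm(100, sd = 10)

fit_raw <- lm(y ~ x1 * x2)                # main effects + product
fit_scl <- lm(y ~ scale(x1) * scale(x2))  # same column space
all.equal(fitted(fit_raw), fitted(fit_scl))  # TRUE

fit_prod <- lm(y ~ I(x1 * x2))  # product only: a different problem
cor(y, fitted(fit_raw)); cor(y, fitted(fit_prod))  # generally differ
```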
