Solved – Interpretation of standardized coefficients (beta) for interactions in linear regression

r, regression, standardization

I am fitting different regression models in R. I calculate standardized regression coefficients (betas) to compare the influence of one variable with that of the other variables in the model. I get my betas either by applying lm.beta() to the fitted model, or by using scale() on my variables before fitting. The results are normally identical and lie between -1 and 1.

Now I stumbled upon standardized regression coefficients (betas) above |1|.
I read e.g. these questions and their answers:

As mentioned there, collinearity can cause betas > |1|. Collinearity occurs, for example, with polynomial terms. In that case orthogonal polynomials (using poly() in R) can solve the problem (but won't give you unstandardized coefficients to set up an equation – discussed here).
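As a quick illustration of that collinearity (a sketch with simulated x, not the data below):

```r
# Raw polynomial columns are strongly correlated; poly() builds an
# orthogonal basis whose columns are uncorrelated by construction.
# (Simulated x for illustration only.)
set.seed(3)
x <- runif(50, 1, 10)

cor(x, x^2)          # close to 1: raw terms are collinear
P <- poly(x, 2)      # orthogonal polynomial basis
cor(P[, 1], P[, 2])  # essentially 0
```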

Collinearity also occurs when interactions are included. In this case lm.beta() returns betas > |1|, but using scale() before fitting does not.
So here the two approaches do not agree. Worse, to my disbelief, the ordering of my coefficients (from least influential – smallest |beta| – to most influential – largest |beta|) changes as well! (I know this might belong on Stack Overflow – I mention it only to show why I have lost my faith in betas.)

This leads to my question: are betas above |1| a valid way to estimate the size of a variable's influence, or does collinearity undermine the explanatory power of betas?


Extra information:

library(lm.beta)

lm_1 <- lm(scale(z) ~ scale(x) * scale(y), data = my_data)
summary(lm_1)

lm_2 <- lm(z ~ x * y, data = my_data)
lm.beta(lm_2)

# => betas(lm_1) ≠ betas(lm_2)
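To show that the two interaction columns are genuinely different variables (lm.beta(), as I understand it, rescales the x:y coefficient by the standard deviation of the raw product x*y, while lm_1 uses the product of the already-scaled variables), here is a minimal sketch with simulated, correlated data, not my_data:

```r
# A product of two scaled variables is, in general, neither centred
# nor of unit variance, so it differs from scale(x * y).
# (Simulated data for illustration only.)
set.seed(1)
x <- rnorm(200)
y <- 0.8 * x + rnorm(200, sd = 0.6)   # x and y correlated

col_lm1 <- scale(x) * scale(y)   # the interaction column lm_1 actually uses

mean(col_lm1)   # not 0: a product of centred variables is not centred
sd(col_lm1)     # not 1: nor does it have unit variance
```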

my_data
 x      y        z
 5   1.2137  62.36373
 5   1.4554  61.10530
 5   1.3055  61.13140
 5   1.4063  53.67698
 5   1.7703  51.04274
 5   1.6428  63.55070
 5   1.8371  43.15657
 5   1.2690  68.02751
 5   1.0688  66.94460
 5   1.7006  51.22292
 5   1.4104  62.44610
 5   0.9719  72.13336
20   1.2477  53.74743
20   1.7939  39.40533
20   1.2164  62.45627
20   2.0238  36.56760
20   2.1101  42.22221
20   1.0962  74.33377
20   1.9728  45.18541
20   1.4349  60.62519
20   0.8630  82.53847
20   1.2026  62.60825
20   1.7915  47.55394
40   2.1199  30.02192
40   1.0481  60.03576
40   1.7281  32.28667
40   1.1551  55.25391
40   1.0704  51.01340
40   1.1502  54.39699
40   1.8935  35.94391
40   1.6328  46.37105
40   1.3036  44.30024
40   0.8852  62.49559
60   0.8111  49.45070
60   2.2221  25.59926
60   1.1936  48.63415
60   1.2311  40.55304
60   1.1662  44.72073
60   1.5028  41.90134
60   1.1842  39.86526
60   2.5178  31.30643
60   1.4953  36.38800

Best Answer

This is an issue of linearizing non-linear problems.  (A strange response, and I'll briefly elaborate before providing an answer.)  We can solve polynomial regression using linear regression because we can treat the higher-order powers as separate independent variables in the regression model (thus the model parameters can be estimated with linear-algebra techniques).  The math works just fine.  But the interpretation is another story...

The problem with standardized coefficients in polynomial models, or in models with interaction terms (which are just a different flavour of polynomial model), is that your $\beta$s can no longer be interpreted separately.  For example, if you have an independent variable and two higher powers (say $x^2$ and $x^3$), then it doesn't make sense to ask about the influence of $x$ alone (nor does it make sense to ask about the influence of the higher powers separately).  There are two reasons for this.  First, as $x$ changes, the other independent variables ($x^2$ and $x^3$) change in turn, so if you want to talk about the standardized effect of $x$, you need to include all of the other terms in the description as well.  Second, it is actually fruitless to talk about the impact of the terms separately, because the right transformation of $x$ (say, $\tilde{x} = x - c$) may make one of the lower-order terms disappear entirely.
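To make the second point concrete, here is a small R sketch (with simulated data, not the question's): shifting $x$ by the right constant $c$ leaves the fitted curve unchanged but makes the linear term vanish, so the coefficient (and hence the beta) of $x$ alone carries no separate meaning.

```r
# For a quadratic a + b*x + d*x^2, re-expressing in terms of x - c gives
# a linear coefficient of b + 2*d*c, which is zero when c = -b / (2*d).
set.seed(2)
x <- runif(40, 0, 10)
z <- 1 + 0.5 * x + 0.3 * x^2 + rnorm(40, sd = 0.5)

fit_raw <- lm(z ~ x + I(x^2))
b <- coef(fit_raw)[["x"]]
d <- coef(fit_raw)[["I(x^2)"]]

c0 <- -b / (2 * d)               # the shift that kills the linear term
fit_shifted <- lm(z ~ I(x - c0) + I((x - c0)^2))

all.equal(fitted(fit_raw), fitted(fit_shifted))  # same fitted curve
coef(fit_shifted)[[2]]                           # ~ 0: linear term is gone
```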

I will stop here, but I am curious whether others have encountered techniques to "combine" the standardized coefficients for higher-order powers or interaction terms.  (Perhaps this could be addressed in a more detailed answer or in the comments.)