Solved – What’s the difference between standardized and unstandardized coefficients in linear regression models

Tags: regression, standardization

When fitting a linear regression model, one calculates coefficients often labelled Beta and B; these values describe how much the dependent variable changes in response to a predictor (or independent) variable.

What's the difference between standardized and unstandardized coefficients in linear regression models?

Best Answer

I guess you mean the following. Consider the simple linear regression model: $$Y_i = \beta_0 + \beta_1 X_i + \epsilon_i, \qquad \epsilon_i \stackrel{d}{=} N(0,\sigma^2).$$

One would estimate $\beta_1$ with a least-squares estimator $\hat \beta_1$, which expresses the expected change in $Y_i$ when $X_i$ increases by one unit.

This usually works (and is the easiest to interpret), but you could also fit the model in other ways.
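As a minimal sketch of the unstandardized case, the following fits the simple model by ordinary least squares on made-up data (the numbers are purely illustrative):

```python
import numpy as np

# Toy data: Y = 3 + 1.5*X + noise, so the true slope is 1.5.
rng = np.random.default_rng(0)
x = rng.normal(10, 2, size=200)
y = 3.0 + 1.5 * x + rng.normal(0, 1, size=200)

# Design matrix with an intercept column, then least squares.
X = np.column_stack([np.ones_like(x), x])
b0_hat, b1_hat = np.linalg.lstsq(X, y, rcond=None)[0]

# b1_hat is the expected change in Y when X increases by one unit.
print(round(b1_hat, 2))
```

The estimate `b1_hat` should land close to the true slope of 1.5, and its interpretation is in the original units of $X$ and $Y$.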

First method - centering the variables

You could center the predictor, for instance using $X_i^\ast = X_i - \overline X$, and fit the model again. This results in $$Y_i = \beta_0^\ast + \beta_1^*X_i^* + \epsilon_i^* \qquad \epsilon_i^\ast \stackrel{d}{=} N(0, \sigma^2)$$

then $\beta_1^\ast$ should be read as the expected increase in $Y_i$ when $X_i$ increases by one unit from its mean. Using centered variables often improves the model (it can reduce multicollinearity, e.g. with interaction or polynomial terms). The value of $\beta_0^\ast$ is also always useful: it is the expected value of $Y_i$ when $X_i = \overline X$.
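A quick numerical check of these two claims, again on made-up data: centering leaves the slope unchanged, and the new intercept equals the sample mean of $Y$ (the fitted value at $X = \overline X$).

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(10, 2, size=200)
y = 3.0 + 1.5 * x + rng.normal(0, 1, size=200)

def ols(x, y):
    """Least-squares fit of y = b0 + b1*x; returns (b0, b1)."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0]

b0, b1 = ols(x, y)
xc = x - x.mean()          # centered predictor
b0c, b1c = ols(xc, y)

# Slope is identical; the new intercept is the mean of Y,
# i.e. the expected value of Y when X equals its mean.
print(np.isclose(b1, b1c), np.isclose(b0c, y.mean()))
```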

Second method - standardizing the variables

You could standardize the variables as well, using $X_i^\ast = \dfrac{X_i-\overline X}{s_{X}}$, and refit the model.

The model then gives you the expected change in $Y_i$ when $X_i$ increases one standard deviation.

This has some numerical advantages.
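To illustrate the one-SD interpretation: after standardizing the predictor, the new slope equals the unstandardized slope multiplied by $s_X$. A sketch, with illustrative data:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(10, 2, size=200)
y = 3.0 + 1.5 * x + rng.normal(0, 1, size=200)

def ols(x, y):
    """Least-squares fit of y = b0 + b1*x; returns (b0, b1)."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0]

b0, b1 = ols(x, y)
xs = (x - x.mean()) / x.std(ddof=1)   # standardized predictor (z-score)
b0s, b1s = ols(xs, y)

# The new slope is the expected change in Y per one-SD increase in X:
# algebraically, b1s = b1 * s_X.
print(np.isclose(b1s, b1 * x.std(ddof=1)))
```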

Third method - standardizing both the variables and the outcome

Most useful in multiple linear regression. Say you want a model like: $$Y_i = \beta_0 + \beta_1 X_{i1}+ \beta_2 X_{i2} + \epsilon_i \qquad \epsilon_i \stackrel{d}{=} N(0,\sigma^2)$$

How would you compare $\beta_1$ to $\beta_2$? In other words, which variable, $X_1$ or $X_2$, has the larger effect?

You could standardize both the predictors (as before) and the $Y$ values, with $Y_i^\ast = \dfrac{Y_i - \overline Y}{s_Y}$. The $\beta_i^\ast$ that is greatest in absolute value points to the predictor with the largest effect, measured in standard-deviation units.
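The comparison above can be sketched as follows. In this made-up example, the raw coefficient on $X_1$ is numerically larger, yet $X_2$ has the bigger effect once both predictors and the outcome are put on the same (SD) scale:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
x1 = rng.normal(0, 1, n)        # small spread
x2 = rng.normal(0, 10, n)       # large spread
y = 2.0 * x1 + 0.5 * x2 + rng.normal(0, 1, n)

def ols(X, y):
    """Least squares with an intercept; returns all coefficients."""
    X = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Unstandardized: b1 ~ 2.0 looks "bigger" than b2 ~ 0.5,
# but that comparison depends on the units of x1 and x2.
_, b1, b2 = ols(np.column_stack([x1, x2]), y)

# Standardize everything (z-scores) and refit: the coefficients
# are now in SD units and directly comparable.
z = lambda v: (v - v.mean()) / v.std(ddof=1)
_, bz1, bz2 = ols(np.column_stack([z(x1), z(x2)]), z(y))

print(abs(bz2) > abs(bz1))  # x2 has the larger standardized effect
```

The standardized coefficient of each predictor is (in the single-predictor case exactly, and here approximately) the raw coefficient times $s_{X_j}/s_Y$, which is why the unit-dependent ranking flips.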