Standardized to Unstandardized Coefficients – Conversion Guide

regression coefficients

My goal is to use the coefficients derived by previous research on the subject to predict actual outcomes given a set of independent variables. However, the research paper lists only the Beta coefficients and t-values. I would like to know if it is possible to convert the standardized coefficients to unstandardized ones.

Would it be useful to convert my unstandardized independent variables to standardized ones to calculate the predicted value? If so, how would I return to an unstandardized predicted value (if that is even possible)?

Added: sample row from the paper:

Number of bus route (buslines) | 0.275(Beta) | 5.70*** (t-value)

I am also given this regarding the independent variables:

Number of bus route (buslines) | 12.56(avg) | 9.02(Std) | 1(min) | 53(max)

Best Answer

It sounds like the paper uses a multiple regression model in the form

$$Y = \beta_0 + \sum_i \beta_i \xi_i + \varepsilon$$

where the $\xi_i$ are standardized versions of the independent variables; viz.,

$$\xi_i = \frac{x_i - m_i}{s_i}$$

with $m_i$ the mean (such as 12.56 in the example) and $s_i$ the standard deviation (such as 9.02 in the example) of the values of the $i^\text{th}$ variable $x_i$ ('buslines' in the example). $\beta_0$ is the intercept (if present). Plugging this expression into the fitted model, with its "betas" written as $\hat{\beta_i}$ (0.275 in the example), and doing some algebra gives the estimates

$$\hat{Y} = \hat{\beta_0} + \sum_i \hat{\beta_i} \frac{x_i - m_i}{s_i}=\left(\hat{\beta_0}-\sum_i\frac{\hat{\beta_i}\, m_i}{s_i}\right)+\sum_i\left(\frac{\hat{\beta_i}}{s_i}\right)x_i.$$

This shows that the coefficients of the $x_i$ in the model (apart from the constant term) are obtained by dividing the betas by the standard deviations of the independent variables, and that the intercept is adjusted by subtracting $\sum_i \hat{\beta_i} m_i / s_i$, a linear combination of the betas weighted by the ratios of means to standard deviations.
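
As a worked instance using only the sample row quoted in the question (buslines: beta 0.275, mean 12.56, standard deviation 9.02), and assuming the paper's betas really are of the form above, the coefficient of buslines and its share of the intercept adjustment come out to roughly

$$\frac{\hat{\beta}}{s} = \frac{0.275}{9.02} \approx 0.0305, \qquad \frac{\hat{\beta}\, m}{s} = \frac{0.275 \times 12.56}{9.02} \approx 0.383.$$

The full intercept cannot be recovered from that row alone: it also needs $\hat{\beta_0}$ and the corresponding terms for every other variable in the model.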

This gives you two ways to predict a new value from a vector $(x_1, \ldots, x_p)$ of independent values:

  1. Using the means $m_i$ and standard deviations $s_i$ as reported in the paper (not recomputed from any new data!), calculate $(\xi_1,\ldots, \xi_p) = ((x_1-m_1)/s_1, \ldots, (x_p-m_p)/s_p)$ and plug those into the regression formula as given by the betas or, equivalently,

  2. Plug $(x_1, \ldots, x_p)$ into the algebraically equivalent formula derived above.
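
The two routes can be checked against each other with a minimal Python sketch. It uses only the single buslines row from the question. The intercept `beta0 = 0.0` and the new observation `x_new = [20.0]` are placeholders invented for illustration (neither comes from the paper), and the function names are likewise hypothetical.

```python
import math

def predict_via_standardizing(x, betas, means, sds, intercept=0.0):
    """Method 1: standardize each x_i with the paper's mean and SD, then apply the betas."""
    xi = [(xv - m) / s for xv, m, s in zip(x, means, sds)]
    return intercept + sum(b * z for b, z in zip(betas, xi))

def unstandardize(betas, means, sds, intercept=0.0):
    """Method 2 setup: convert betas to raw-scale slopes and an adjusted intercept."""
    slopes = [b / s for b, s in zip(betas, sds)]
    raw_intercept = intercept - sum(b * m / s for b, m, s in zip(betas, means, sds))
    return raw_intercept, slopes

def predict_via_raw(x, raw_intercept, slopes):
    """Method 2: apply the raw-scale formula directly to the unstandardized x_i."""
    return raw_intercept + sum(c * xv for c, xv in zip(slopes, x))

# Values reported in the paper for 'buslines'.
betas = [0.275]
means = [12.56]
sds = [9.02]

# ASSUMPTION: the paper's intercept is not quoted in the question; 0.0 is a placeholder.
beta0 = 0.0

# ASSUMPTION: a made-up new observation with 20 bus routes.
x_new = [20.0]

y1 = predict_via_standardizing(x_new, betas, means, sds, beta0)
raw_intercept, slopes = unstandardize(betas, means, sds, beta0)
y2 = predict_via_raw(x_new, raw_intercept, slopes)

print(y1, y2, math.isclose(y1, y2))
```

Both routes give the same prediction up to floating-point rounding, so the choice between them is a matter of convenience. In a real application the lists would hold one entry per variable in the paper's model.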

If the paper is using a Generalized Linear Model, you may need to follow this calculation by applying the inverse "link" function to $\hat{Y}$. For example, with logistic regression it would be necessary to apply the logistic function $1/(1 + \exp(-\hat{Y}))$ to obtain the predicted probability ($\hat{Y}$ is the predicted log odds).
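
For the logistic case, the inverse link could be sketched in Python as below. The function name `inverse_logit` is my own, and the branch on the sign of the input is an implementation choice to avoid floating-point overflow for large negative values; it computes the same quantity as $1/(1 + \exp(-\hat{Y}))$.

```python
import math

def inverse_logit(eta):
    """Map a linear predictor (predicted log odds) to a probability."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    # For negative eta, use the equivalent form exp(eta) / (1 + exp(eta)),
    # which avoids evaluating exp() at a large positive argument.
    z = math.exp(eta)
    return z / (1.0 + z)

# A predicted log odds of 0 corresponds to a probability of one half.
print(inverse_logit(0.0))
```

Other link functions (log, probit, and so on) would need their own inverses; which one applies depends on the model the paper actually fitted.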