For simplicity, assume that there is one focal continuous predictor $x$ and a continuous outcome $y$ (standardization makes little sense for categorical predictors, in my opinion). The regression model may include further predictors, but the following answer focuses only on this one coefficient. Then, we have four possibilities:
- Both $y$ and $x$ are standardized (meaning both have mean $0$ and standard deviation $1$). Denote the regression coefficient of $x$ as $\beta_{xy}$.
- Only $x$ is standardized. Denote the regression coefficient of $x$ as $\beta_{x}$.
- Only $y$ is standardized. Denote the regression coefficient of $x$ as $\beta_{y}$.
- Neither $y$ nor $x$ is standardized. Denote the regression coefficient of $x$ as $\beta$.
Further, let $s_x$ and $s_y$ be the standard deviations of $x$ and $y$, respectively.
In the following sections, I'm going to show how to convert the regression coefficients from the standardized models (cases 1-3) to the coefficient in the unstandardized model (case 4) and vice versa. The crucial thing to note is that the same conversion formulas can be applied to standard errors and confidence limits! An illustration of case 1 in R is at the bottom of this answer.
Case 1: Both $y$ and $x$ are standardized
To convert from $\beta$ to $\beta_{xy}$ without running another model: $\beta_{xy} = \beta\cdot \frac{s_x}{s_y}$.
To convert from $\beta_{xy}$ to $\beta$ without running another model: $\beta = \beta_{xy}\cdot \frac{s_y}{s_x}$.
To answer your first question: Calculate the regression model with no standardized variables, then multiply the confidence limits of the regression coefficient by $\frac{s_x}{s_y}$.
Case 2: Only $x$ is standardized
To convert from $\beta$ to $\beta_{x}$ without running another model: $\beta_{x} = \beta\cdot s_x$.
To convert from $\beta_{x}$ to $\beta$ without running another model: $\beta = \beta_{x}\cdot \frac{1}{s_x}$.
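As a quick sketch of this conversion (borrowing the swiss data and the Fertility predictor from the illustration at the bottom of this answer; the variable choice is purely for demonstration):

```r
# Standard deviation of the focal predictor
sx <- sd(swiss$Fertility)

# Unstandardized model and a model where only Fertility is standardized
mod   <- lm(Infant.Mortality ~ Fertility + Agriculture, data = swiss)
mod_x <- lm(Infant.Mortality ~ scale(Fertility) + Agriculture, data = swiss)

# beta_x = beta * s_x
coef(mod)[2] * sx
coef(mod_x)[2]  # should match
```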
Case 3: Only $y$ is standardized
To convert from $\beta$ to $\beta_{y}$ without running another model: $\beta_{y} = \beta\cdot \frac{1}{s_y}$.
To convert from $\beta_{y}$ to $\beta$ without running another model: $\beta = \beta_{y}\cdot s_y$.
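Analogously for this case, a sketch with the same swiss data (again, the variables are just for demonstration):

```r
# Standard deviation of the outcome
sy <- sd(swiss$Infant.Mortality)

# Unstandardized model and a model where only the outcome is standardized
mod   <- lm(Infant.Mortality ~ Fertility + Agriculture, data = swiss)
mod_y <- lm(scale(Infant.Mortality) ~ Fertility + Agriculture, data = swiss)

# beta_y = beta / s_y
coef(mod)[2] / sy
coef(mod_y)[2]  # should match
```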
Here is a short illustration in R for the first case. The focal predictor is Fertility:
# Standard deviations
sx <- sd(swiss$Fertility)
sy <- sd(swiss$Infant.Mortality)
# Models
mod_unstand <- lm(Infant.Mortality~Fertility + Agriculture, data = swiss)
mod_fully_stand <- lm(scale(Infant.Mortality)~scale(Fertility) + scale(Agriculture), data = swiss)
coef(mod_unstand)[2]
Fertility
0.1166856
# Convert unstandardized coefficient of "Fertility" to a fully standardized one
0.11668557*(sx/sy)
[1] 0.50043
# Check
coef(mod_fully_stand)[2]
scale(Fertility)
0.50043
For the confidence intervals, we use the same conversions:
# Confidence interval for the unstandardized coefficient
confint(mod_unstand)[2, ]
2.5 % 97.5 %
0.04993591 0.18343524
# Convert the confidence limits from the unstandardized model to the fully standardized model
confint(mod_unstand)[2, ]*(sx/sy)
2.5 % 97.5 %
0.2141604 0.7866996
# Check
confint(mod_fully_stand)[2, ]
2.5 % 97.5 %
0.2141604 0.7866996
Let $x_i, i=1,\ldots,p$ denote your unscaled predictors. Then the scaled predictors are $z_i = (x_i-m_i)/s_i$, where $m_i$ and $s_i$ are the sample mean and standard deviations of $x_i$, respectively. The linear predictor in the intended model is $$\alpha + \sum_{i=1}^p\beta_i x_i = \alpha + \sum_{i=1}^p\beta_i (s_i z_i + m_i) = (\alpha+\sum_{i=1}^p \beta_i m_i) + \sum_{i=1}^p (s_i\beta_i) z_i$$
Let $\alpha^*$ and $\beta^*_i$ denote the coefficients of the scaled model. From the above we have $$\beta_i^* = s_i\beta_i$$ $$\alpha^* = \alpha+\sum_{i=1}^p \beta_i m_i$$
So if you use the scaled variables as predictors, their coefficients get multiplied by $s_i$ (note that we did not scale $y$, so there is no division by $s_y$), and that operation is easy to "undo". Solving for the original coefficients we get $$\beta_i = \beta_i^*/s_i$$ $$\alpha = \alpha^* - \sum_{i=1}^p (m_i/s_i)\beta_i^*$$
The original intercept is a linear combination of the coefficients of the model with the scaled predictors. To get its standard error, we have to take into account the covariance matrix of the coefficient estimates. Using matrix notation, let $C=[1, -m_1/s_1, \ldots,-m_p/s_p]$ be the row vector of weights in the linear combination, and let $V(\hat{b^*})$ be the covariance matrix of the estimates of the model with the scaled predictors, $b^*=[\alpha^*, \beta_1^*,\ldots,\beta_p^*]'$. Then $$\hat\alpha = Cb^*, \quad SE(\hat\alpha) = \sqrt{C V(\hat{b^*}) C'}$$
In R, you can use, for example, the glht function from the multcomp package to do the calculation for you. Here is an example of how this would work with a simple linear model. The only difference with glmer is that you would use fixef instead of coef to extract the model coefficients $\hat{b^*}$, and you would not need the df option in glht, because you would use a normal approximation anyway.
> require(mvtnorm)
> set.seed(462627)
> N <- 50
> x <- rmvnorm(N, mean = c(1, -2), sigma = matrix(c(4, 3, 3, 9), nr=2))
> a <- -2; b <- c(0.5, 2)
> y <- rnorm(N, mean= a+x %*% b, sd=1)
>
> # fit model with unscaled predictors
> mod1 <- lm(y ~ x)
> summary(mod1)
Call:
lm(formula = y ~ x)
Residuals:
Min 1Q Median 3Q Max
-1.44223 -0.60469 0.01115 0.48777 2.73533
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.80351 0.19575 -9.213 4.21e-12 ***
x1 0.35977 0.07559 4.760 1.89e-05 ***
x2 2.03471 0.04348 46.795 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.8495 on 47 degrees of freedom
Multiple R-squared: 0.9853, Adjusted R-squared: 0.9846
F-statistic: 1570 on 2 and 47 DF, p-value: < 2.2e-16
>
> # fit model with scaled predictors
> z <- scale(x)
> mod2 <- lm(y ~ z)
> summary(mod2)
Call:
lm(formula = y ~ z)
Residuals:
Min 1Q Median 3Q Max
-1.44223 -0.60469 0.01115 0.48777 2.73533
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -3.7206 0.1201 -30.97 < 2e-16 ***
z1 0.6573 0.1381 4.76 1.89e-05 ***
z2 6.4619 0.1381 46.80 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.8495 on 47 degrees of freedom
Multiple R-squared: 0.9853, Adjusted R-squared: 0.9846
F-statistic: 1570 on 2 and 47 DF, p-value: < 2.2e-16
>
> # calculate intercept from mod1 using mod2
> m <- attr(z, "scaled:center")
> s <- attr(z, "scaled:scale")
> weights <- c(1, -m/s)
> # by hand
> (int <- coef(mod2) %*% weights)
[,1]
[1,] -1.803515
> (se.int <- sqrt(weights %*% vcov(mod2) %*% weights))
[,1]
[1,] 0.1957483
>
> #using glht
> require(multcomp)
> summary(glht(mod2, linfct = rbind(weights), df=mod1$df.residual))
Simultaneous Tests for General Linear Hypotheses
Fit: lm(formula = y ~ z)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
weights == 0 -1.8035 0.1957 -9.213 4.21e-12 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Adjusted p values reported -- single-step method)
Best Answer
Let $\beta$ denote the unstandardized coefficient and $\beta^\star$ the standardized coefficient. The relationship between the standardized and unstandardized coefficient is as follows $$ \beta^\star = \beta\cdot\frac{s_x}{s_y} $$
where $s_x$ and $s_y$ denote the standard deviations of the predictor and the dependent variable, respectively. This assumes that both $x$ and $y$ have been standardized before the regression. The crucial thing is this: the relationship also holds for the standard error and confidence limits (see my answer here). Because you have both $\beta$ and $\beta^\star$, you can calculate the conversion factor $\frac{s_x}{s_y}$ and apply it to the standard error. Hence, to calculate the standard error of the standardized coefficient, apply the following conversion $$ \mathrm{SE}^\star = \mathrm{SE}\cdot\frac{\beta^\star}{\beta} $$
If the confidence limits for the unstandardized coefficient are given, you can apply this conversion directly to the limits to get the CI for the standardized coefficient.
Here is a quick example in R to check:
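A minimal sketch of such a check, reusing the swiss data from the first answer (the variables are arbitrary; the scaled data frame is built by hand to keep summary() output simple):

```r
# Unstandardized model and fully standardized model
swiss_std <- as.data.frame(scale(swiss))
mod     <- lm(Infant.Mortality ~ Fertility + Agriculture, data = swiss)
mod_std <- lm(Infant.Mortality ~ Fertility + Agriculture, data = swiss_std)

b  <- coef(mod)[2]      # unstandardized coefficient of Fertility
bs <- coef(mod_std)[2]  # standardized coefficient
se <- summary(mod)$coefficients[2, "Std. Error"]

# SE* = SE * (beta*/beta)
se * (bs / b)
summary(mod_std)$coefficients[2, "Std. Error"]  # should match
```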