Solved – How to get 95% CIs for standardized regression coefficients

Tags: categorical data, confidence interval, r, regression, regression coefficients

I am running a multiple linear regression with categorical variables, and I need 95% confidence intervals for the standardized regression coefficients. I searched around and found two methods:

  1. Using the function lm.beta from the QuantPsyc package. However, lm.beta only returns the standardized coefficients, whereas I also need their 95% CIs. Is there a way to get them?

  2. Standardizing all the variables involved first and then running the linear regression on them, which gives estimates of the standardized coefficients.

So here is my model:

model1 <- lm(Life_Satisfaction ~ Subjective + Age + Sex + CountryCat11 + 
                                 CountryCat12 + CountryCat13 + CountryCat14 + 
                                 CountryCat15 + CountryCat16 + CountryCat17 + 
                                 CountryCat18 + CountryCat19 + CountryCat20 + 
                                 CountryCat23 + CountryCat25 + CountryCat28 + 
                                 CountryCat29 + CountryCat30 + Education_ISCED1 + 
                                 Education_ISCED2 + Education_ISCED3 + 
                                 Education_ISCED4 + Education_ISCED5 + 
                                 Education_ISCED6 + Education_stillinschool + 
                                 Education_None + Education_other, data=lifesat)

library(QuantPsyc)
lm.beta(model1)

I ran that, but I cannot get the 95% CI.

So I tried the scale method:

model2 <- lm(scale(Life_Satisfaction) ~ scale(Subjective) + scale(Age) + 
                                        scale(Sex) + scale(CountryCat11) + 
                                        scale(CountryCat12) + scale(CountryCat13) + 
                                        scale(CountryCat14) + scale(CountryCat15) + 
                                        scale(CountryCat16) + scale(CountryCat17) + 
                                        scale(CountryCat18) + scale(CountryCat19) + 
                                        scale(CountryCat20) + scale(CountryCat23) + 
                                        scale(CountryCat25) + scale(CountryCat28) + 
                                        scale(CountryCat29) + scale(CountryCat30) + 
                                    scale(Education_ISCED1) + scale(Education_ISCED2) + 
                                    scale(Education_ISCED3) + scale(Education_ISCED4) + 
                                    scale(Education_ISCED5) + scale(Education_ISCED6) + 
                               scale(Education_stillinschool) + scale(Education_None) + 
                                        scale(Education_other), data=lifesat)

summary(model2)

I ran that and got standardized coefficients and 95% CIs, but they differ from the standardized results I got from SPSS. Did I do something wrong?

Best Answer

For simplicity, assume that there is one focal continuous predictor $x$ and a continuous outcome $y$. In my opinion, standardization doesn't make much sense for categorical predictors. The regression model could include more predictors, but this answer focuses on only one of them. We then have four possibilities:

  1. Both $y$ and $x$ are standardized (meaning both have mean $0$ and standard deviation $1$). Denote the regression coefficient of $x$ as $\beta_{xy}$.
  2. Only $x$ is standardized. Denote the regression coefficient of $x$ as $\beta_{x}$.
  3. Only $y$ is standardized. Denote the regression coefficient of $x$ as $\beta_{y}$.
  4. Neither $y$ nor $x$ is standardized. Denote the regression coefficient of $x$ as $\beta$.

Further, let $s_x$ and $s_y$ be the standard deviations of $x$ and $y$, respectively.

In the following sections, I show how to convert the regression coefficients from the standardized models (cases 1-3) to the coefficient in the unstandardized model (case 4) and vice versa. The crucial thing to note is that the same conversion formulas also apply to the standard errors and the confidence limits. An illustration of case 1 in R is at the bottom of this answer.

Case 1: Both $y$ and $x$ are standardized

To convert from $\beta$ to $\beta_{xy}$ without running another model: $\beta_{xy} = \beta\cdot \frac{s_x}{s_y}$.
To convert from $\beta_{xy}$ to $\beta$ without running another model: $\beta = \beta_{xy}\cdot \frac{s_y}{s_x}$.

To answer your first question: fit the regression model without standardizing any variables, then multiply the confidence limits of each regression coefficient by $\frac{s_x}{s_y}$, using that predictor's own $s_x$.
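
Why the same factor carries over to the confidence limits can be sketched as follows. This treats $s_x$ and $s_y$ as fixed constants, which is also what happens implicitly when the model is fitted on `scale()`d variables. The symbols $c$, $\hat\beta$, $\widehat{\mathrm{SE}}$ and $t^{*}$ are my own notation, not part of the definitions above: $c = \frac{s_x}{s_y}$, a hat marks an estimate, and $t^{*}$ is the $0.975$ quantile of the $t$ distribution with the residual degrees of freedom.

$$
\hat\beta_{xy} = c\,\hat\beta
\quad\Longrightarrow\quad
\widehat{\mathrm{SE}}(\hat\beta_{xy}) = c\,\widehat{\mathrm{SE}}(\hat\beta)
$$

$$
\hat\beta_{xy} \pm t^{*}\,\widehat{\mathrm{SE}}(\hat\beta_{xy})
= c\left(\hat\beta \pm t^{*}\,\widehat{\mathrm{SE}}(\hat\beta)\right)
$$

Rescaling a variable does not change the residual degrees of freedom, so $t^{*}$ is the same in both models, and because $c > 0$ the lower limit stays the lower limit. Cases 2 and 3 follow the same argument with $c = s_x$ and $c = \frac{1}{s_y}$, respectively.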

Case 2: Only $x$ is standardized

To convert from $\beta$ to $\beta_{x}$ without running another model: $\beta_{x} = \beta\cdot s_x$.
To convert from $\beta_{x}$ to $\beta$ without running another model: $\beta = \beta_{x}\cdot \frac{1}{s_x}$.

Case 3: Only $y$ is standardized

To convert from $\beta$ to $\beta_{y}$ without running another model: $\beta_{y} = \beta\cdot \frac{1}{s_y}$.
To convert from $\beta_{y}$ to $\beta$ without running another model: $\beta = \beta_{y}\cdot s_y$.
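
As a quick numerical check of cases 2 and 3, here is a minimal sketch using R's built-in swiss data, with Fertility as the focal predictor and Infant.Mortality as the outcome. The object names (mod_x_stand, mod_y_stand, checks) are labels I made up for this example.

```r
# Standard deviations of the focal predictor and the outcome
sx <- sd(swiss$Fertility)
sy <- sd(swiss$Infant.Mortality)

# Case 4 (nothing standardized), case 2 (only x), case 3 (only y)
mod_unstand <- lm(Infant.Mortality ~ Fertility + Agriculture, data = swiss)
mod_x_stand <- lm(Infant.Mortality ~ scale(Fertility) + Agriculture, data = swiss)
mod_y_stand <- lm(scale(Infant.Mortality) ~ Fertility + Agriculture, data = swiss)

# Unstandardized estimate and confidence limits for Fertility
b  <- unname(coef(mod_unstand)["Fertility"])
ci <- unname(confint(mod_unstand)["Fertility", ])

# Compare the converted values with the refitted models
checks <- c(
  case2_coef = isTRUE(all.equal(b * sx,  unname(coef(mod_x_stand)["scale(Fertility)"]))),
  case2_ci   = isTRUE(all.equal(ci * sx, unname(confint(mod_x_stand)["scale(Fertility)", ]))),
  case3_coef = isTRUE(all.equal(b / sy,  unname(coef(mod_y_stand)["Fertility"]))),
  case3_ci   = isTRUE(all.equal(ci / sy, unname(confint(mod_y_stand)["Fertility", ])))
)
print(checks)
```

Each entry of checks should be TRUE if the conversions hold up to numerical tolerance.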

Here is a short illustration in R for the first case. The focal predictor is Fertility:

# Standard deviations
sx <- sd(swiss$Fertility)
sy <- sd(swiss$Infant.Mortality)

# Models
mod_unstand <- lm(Infant.Mortality~Fertility + Agriculture, data = swiss)
mod_fully_stand <- lm(scale(Infant.Mortality)~scale(Fertility) + scale(Agriculture), data = swiss)

coef(mod_unstand)[2]

Fertility 
0.1166856

# Convert unstandardized coefficient of "Fertility" to a fully standardized one
unname(coef(mod_unstand)[2])*(sx/sy)

[1] 0.50043

# Check
coef(mod_fully_stand)[2]

scale(Fertility) 
         0.50043

For the confidence intervals, we use the same conversions:

# Confidence interval for the unstandardized coefficient
confint(mod_unstand)[2, ]

     2.5 %     97.5 % 
0.04993591 0.18343524

# Convert the confidence limits from the unstandardized model to the fully standardized model
confint(mod_unstand)[2, ]*(sx/sy)

    2.5 %    97.5 % 
0.2141604 0.7866996

# Check
confint(mod_fully_stand)[2, ]

    2.5 %    97.5 % 
0.2141604 0.7866996
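
To apply the case 1 conversion to every predictor at once, something like the following could work. This is only a sketch: std_confint is a hypothetical helper I wrote for this answer, not a function from base R or QuantPsyc, and it assumes that each predictor enters the formula untransformed as a numeric column (0/1 dummy columns included), so that coefficient names match the column names of the model frame.

```r
# Hypothetical helper: standardized coefficients with confidence limits,
# obtained by rescaling the unstandardized estimates by s_x / s_y.
std_confint <- function(model, level = 0.95) {
  mf <- model.frame(model)           # data actually used in the fit
  sy <- sd(model.response(mf))       # SD of the outcome

  # Numeric predictors that appear as coefficients under their own name
  preds <- intersect(names(coef(model)), names(mf)[-1])
  preds <- preds[vapply(mf[preds], is.numeric, logical(1))]

  # One conversion factor s_x / s_y per predictor
  ratio <- vapply(mf[preds], sd, numeric(1)) / sy

  # Rows are predictors, so the vector 'ratio' is recycled row by row
  ci <- confint(model, parm = preds, level = level)
  cbind(estimate = coef(model)[preds], ci) * ratio
}

mod_unstand <- lm(Infant.Mortality ~ Fertility + Agriculture, data = swiss)
std_confint(mod_unstand)
```

The row for Fertility should reproduce the interval shown above. Note that the standard deviations are computed on the rows the model actually used, which can differ from `scale()` inside a formula when there are missing values.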