I am running a multiple linear regression with categorical variables and I need 95% confidence intervals for the standardized regression coefficients. I searched around and found two methods:
- Using the QuantPsyc package, with the function lm.beta. However, using lm.beta I can only get the standardized coefficients, whereas I also need their 95% CIs. Is there a way?
- To extract standardized regression coefficients, first standardize all the variables involved and then run the linear regression; the resulting estimates are the standardized coefficients.
So here is my model:
model1 <- lm(Life_Satisfaction ~ Subjective + Age + Sex + CountryCat11 +
CountryCat12 + CountryCat13 + CountryCat14 +
CountryCat15 + CountryCat16 + CountryCat17 +
CountryCat18 + CountryCat19 + CountryCat20 +
CountryCat23 + CountryCat25 + CountryCat28 +
CountryCat29 + CountryCat30 + Education_ISCED1 +
Education_ISCED2 + Education_ISCED3 +
Education_ISCED4 + Education_ISCED5 +
Education_ISCED6 + Education_stillinschool +
Education_None + Education_other, data=lifesat)
library(QuantPsyc)  # provides lm.beta()
lm.beta(model1)
I ran that, but I cannot get the 95% CI.
So I tried the scale method:
model2 <- lm(scale(Life_Satisfaction) ~ scale(Subjective) + scale(Age) +
scale(Sex) + scale(CountryCat11) +
scale(CountryCat12) + scale(CountryCat13) +
scale(CountryCat14) + scale(CountryCat15) +
scale(CountryCat16) + scale(CountryCat17) +
scale(CountryCat18) + scale(CountryCat19) +
scale(CountryCat20) + scale(CountryCat23) +
scale(CountryCat25) + scale(CountryCat28) +
scale(CountryCat29) + scale(CountryCat30) +
scale(Education_ISCED1) + scale(Education_ISCED2) +
scale(Education_ISCED3) + scale(Education_ISCED4) +
scale(Education_ISCED5) + scale(Education_ISCED6) +
scale(Education_stillinschool) + scale(Education_None) +
scale(Education_other), data=lifesat)
summary(model2)
I ran that and got the standardized regression coefficients and 95% CIs, but they were different from the standardized regression results I got from SPSS. Did I do it wrong?
Best Answer
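For simplicity, assume that there is one focal continuous predictor $x$ and a continuous outcome $y$. Standardization doesn't really make a lot of sense with categorical predictors, imo. The regression model could include more predictors, but the following answer focuses on only one of them. Then, we have four possibilities:
- Case 1: both $y$ and $x$ are standardized, with coefficient $\beta_{xy}$.
- Case 2: only $x$ is standardized, with coefficient $\beta_{x}$.
- Case 3: only $y$ is standardized, with coefficient $\beta_{y}$.
- Case 4: neither $y$ nor $x$ is standardized, with coefficient $\beta$.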
Further, let $s_x$ and $s_y$ be the standard deviations of $x$ and $y$, respectively.
In the following section, I'm going to show how to convert the regression coefficients from the standardized models (cases 1-3) to the coefficient in the unstandardized model (case 4) and vice versa. The crucial thing to note is that the same conversion formulas can be applied for converting standard errors and/or confidence limits! An illustration of case 1 in R is at the bottom of this answer.
Case 1: Both $y$ and $x$ are standardized
To convert from $\beta$ to $\beta_{xy}$ without running another model: $\beta_{xy} = \beta\cdot \frac{s_x}{s_y}$.
To convert from $\beta_{xy}$ to $\beta$ without running another model: $\beta = \beta_{xy}\cdot \frac{s_y}{s_x}$.
To answer your first question: Fit the regression model with no standardized variables. Then multiply the confidence limits for the regression coefficient of $x$ by $\frac{s_x}{s_y}$.
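A short derivation may help show where the factor $\frac{s_x}{s_y}$ comes from. It is only a sketch for a single predictor: the intercept $\alpha$, the error term $\varepsilon$, the means $\bar{x}$ and $\bar{y}$, and the standardized variables $z_x$ and $z_y$ are notation introduced here, and $s_x$ and $s_y$ are treated as fixed constants (which is also what refitting the model on scaled variables does).

$$
\begin{aligned}
y &= \alpha + \beta x + \varepsilon, \qquad z_x = \frac{x - \bar{x}}{s_x}, \qquad z_y = \frac{y - \bar{y}}{s_y} \\
\bar{y} + s_y z_y &= \alpha + \beta\,(\bar{x} + s_x z_x) + \varepsilon \\
z_y &= \frac{\alpha + \beta\bar{x} - \bar{y}}{s_y} + \underbrace{\beta\cdot\frac{s_x}{s_y}}_{\beta_{xy}}\, z_x + \frac{\varepsilon}{s_y}
\end{aligned}
$$

Because $\hat{\beta}_{xy}$ is just $\hat{\beta}$ multiplied by the positive constant $\frac{s_x}{s_y}$, its standard error and both confidence limits are multiplied by that same constant, and the lower limit stays the lower limit.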
Case 2: Only $x$ is standardized
To convert from $\beta$ to $\beta_{x}$ without running another model: $\beta_{x} = \beta\cdot s_x$.
To convert from $\beta_{x}$ to $\beta$ without running another model: $\beta = \beta_{x}\cdot \frac{1}{s_x}$.
Case 3: Only $y$ is standardized
To convert from $\beta$ to $\beta_{y}$ without running another model: $\beta_{y} = \beta\cdot \frac{1}{s_y}$.
To convert from $\beta_{y}$ to $\beta$ without running another model: $\beta = \beta_{y}\cdot s_y$.
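The conversions for cases 2 and 3 can be checked numerically. The following is a minimal sketch, assuming R's built-in swiss data; the outcome (Infant.Mortality), the extra covariate (Agriculture), and the helper column names ending in _z are chosen only for illustration and are not part of the original answer.

```r
# Assumption: built-in swiss data; outcome and covariate are illustrative choices.
d <- swiss

# Standardized copies of the focal predictor and the outcome (hypothetical names)
d$Fertility_z        <- as.numeric(scale(d$Fertility))
d$Infant.Mortality_z <- as.numeric(scale(d$Infant.Mortality))

s_x <- sd(d$Fertility)
s_y <- sd(d$Infant.Mortality)

# Case 4: nothing standardized
mod  <- lm(Infant.Mortality ~ Fertility + Agriculture, data = d)
beta <- unname(coef(mod)["Fertility"])

# Case 2: only x standardized
mod_x  <- lm(Infant.Mortality ~ Fertility_z + Agriculture, data = d)
beta_x <- unname(coef(mod_x)["Fertility_z"])

# Case 3: only y standardized
mod_y  <- lm(Infant.Mortality_z ~ Fertility + Agriculture, data = d)
beta_y <- unname(coef(mod_y)["Fertility"])

# Both comparisons should return TRUE (up to numerical tolerance)
all.equal(beta_x, beta * s_x)
all.equal(beta_y, beta / s_y)
```

The other covariate is left unstandardized here; rescaling it would change its own coefficient but not the coefficient of the focal predictor.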
Here is a short illustration in R for the first case. The focal predictor is Fertility:
For the confidence intervals, we use the same conversions:
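A minimal sketch of such an illustration, assuming R's built-in swiss data (which has a Fertility column). The outcome (Infant.Mortality) and the extra covariate (Agriculture) are assumptions picked only for the example, so the numbers will not necessarily match any other analysis.

```r
# Assumption: built-in swiss data; outcome and covariate are illustrative choices.
data(swiss)

# Case 4: model with no standardized variables
mod <- lm(Infant.Mortality ~ Fertility + Agriculture, data = swiss)

s_x <- sd(swiss$Fertility)
s_y <- sd(swiss$Infant.Mortality)

# Convert the unstandardized coefficient and its 95% confidence limits
beta    <- coef(mod)["Fertility"]
beta_xy <- beta * s_x / s_y
ci_xy   <- confint(mod, "Fertility", level = 0.95) * s_x / s_y

beta_xy
ci_xy

# Check against case 1: refit with all variables standardized
swiss_z <- as.data.frame(scale(swiss))
mod_z   <- lm(Infant.Mortality ~ Fertility + Agriculture, data = swiss_z)

# Both comparisons should return TRUE (up to numerical tolerance)
all.equal(unname(beta_xy), unname(coef(mod_z)["Fertility"]))
all.equal(unname(ci_xy), unname(confint(mod_z, "Fertility", level = 0.95)))
```

Standard errors convert the same way: multiplying the standard error of the Fertility coefficient from mod by s_x / s_y should reproduce the standard error reported for mod_z.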