I was looking at the mtcars dataset and exploring the relationship between MPG and the transmission modes (auto/manual). I decided to use the following linear models with the regressors specified in the below R code:
> data(mtcars)
> fit <- lm(mpg ~ I(wt - mean(wt)) + I(qsec - mean(qsec)) + factor(am), data = mtcars)
> round(summary(fit)$coeff, 4)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 18.8979 0.7194 26.2707 0.0000
I(wt - mean(wt)) -3.9165 0.7112 -5.5069 0.0000
I(qsec - mean(qsec)) 1.2259 0.2887 4.2467 0.0002
factor(am)1 2.9358 1.4109 2.0808 0.0467
From the above, the P-value of the slope coefficient "factor(am)1" is lower than 0.05 and hence we'd reject the null hypothesis and infer that cars of manual transmission has higher MPG value than those of manual transmission.
However, I also tried to explore the equivalent linear model without intercept terms, as per R code below:
> fit2 <- lm(mpg ~ I(wt - mean(wt)) + I(qsec - mean(qsec)) + factor(am)-1, data=mtcars)
> round(summary(fit2)$coeff, 4)
Estimate Std. Error t value Pr(>|t|)
I(wt - mean(wt)) -3.9165 0.7112 -5.5069 0e+00
I(qsec - mean(qsec)) 1.2259 0.2887 4.2467 2e-04
factor(am)0 18.8979 0.7194 26.2707 0e+00
factor(am)1 21.8338 0.9438 23.1344 0e+00
> confint(fit2)
2.5 % 97.5 %
I(wt - mean(wt)) -5.3733342 -2.459673
I(qsec - mean(qsec)) 0.6345732 1.817199
factor(am)0 17.4244109 20.371471
factor(am)1 19.9005360 23.767021
From the 95% Confidence Interval constructed, a car (of auto transmission) with average wt and qsec has a MPG interval [17.4244, 20.3714], while a car (of manual transmission) with average wt and qsec has a MPG interval [19.9005. 23.7670].
The two Confidence Intervals overlap, and we failed to reject the null hypothesis where statistically there's no difference between the MPG performance of cars (with auto transmission) and MPG performance of cars (with manual transmission).
I used two equivalent linear models and they gave me different conclusions. Could you enlighten me on what I may have missed here?
Best Answer
The models are equivalent. The misconception here is that overlapping confidence intervals do not mean you "failed to reject the null hypothesis." You do not compare the upper/lower bounds of confidence intervals and call it a day. Thomas' comment links1 to a good general explanation of why this is, though it doesn't directly apply to what's happening under the hood in a regression setting2.
Paul has linked to an explanation of how the
lm
t-statistic is calculated, and that test is a test of that coefficient against zero (e.g. $\beta_3 = 0$ and $\beta_4 = 0$). But you want to test $\beta_3 - \beta_4 = 0$.Performing this test will then have the equivalent result to
fit1
- there are quite a fewR
packages that do this type of test - the following usescar
:For reference, this type of test (and more generalized versions of it) are called an absolutely silly number of different names depending on software, book, and field. (general linear hypothesis, linear contrasts, differences of least square means, regression Wald tests, more...), if you want to look up the theory and how the manual calculation is done.
1 imgur mirror of contents
2 This is because the estimated coefficients and their standard errors are not independent, and so you have to take that into account when running the test. If we just run a straight t-test (which is what the theory reduces down to in this case) without taking this into account with the given values results in the incorrect: