Solved – multiple regression and multiple comparisons

Tags: multiple-regression, multiple-comparisons

Say I fit a multiple regression with $p$ explanatory variables. The $t$-test allows me to check whether any single one of them is significant ($H_0: \beta_i = 0$), and I can do a partial $F$-test to check whether some subset of them is jointly significant ($H_0: \beta_i = \beta_j = \dots = \beta_k = 0$).
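For concreteness, here is a minimal sketch of both tests in Python with statsmodels; the data and variable names (x1 through x5, y) are made up for illustration and are not from any particular analysis:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Made-up data: 5 covariates and one outcome (only x1 truly matters here)
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(200, 5)),
                  columns=[f"x{i}" for i in range(1, 6)])
df["y"] = 2 * df["x1"] + rng.normal(size=200)

fit = smf.ols("y ~ x1 + x2 + x3 + x4 + x5", data=df).fit()

# Per-coefficient t-tests (H0: beta_i = 0): one p-value per covariate
print(fit.pvalues)

# Partial F-test for a subset (H0: beta_3 = beta_4 = beta_5 = 0)
print(fit.f_test("x3 = 0, x4 = 0, x5 = 0"))
```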

What I often see, though, is that someone gets 5 p-values from 5 $t$-tests (assuming they had 5 covariates) and only keeps the ones with a p-value < 0.05. That seems a bit incorrect, as there really should be a multiple-comparison correction, no? Is it really fair to say something like "$\beta_1$ and $\beta_2$ are significant but $\beta_3$, $\beta_4$ and $\beta_5$ are not"?

On a related note, say I run 2 regressions on 2 separate models (different outcomes). Does there need to be a multiple-comparison correction for significant parameters across the two outcomes?

Edit:
To differentiate this from the similar question: is there any other interpretation of the p-values besides "$\beta_i$ is (in)significant, adjusting for all the other covariates"? It doesn't seem that this interpretation allows me to look at every $\beta_i$ and drop those with p-values above 0.05 (which is what the other post describes).

It seems to me that a sure-fire way to test whether $B_i$ and $Y$ have a relationship would be to get a correlation-coefficient p-value for each covariate and then apply a multiple-comparison correction (although that would definitely lose signal).

Finally, say I computed the correlations between $B_1$/$Y_1$, $B_2$/$Y_1$ and $B_3$/$Y_1$ (thus three p-values). Unrelatedly, I also computed the correlations between $T_1$/$Y_2$, $T_2$/$Y_2$ and $T_3$/$Y_2$. I'm assuming the correct Bonferroni adjustment factor would be 6 for all 6 tests taken together (rather than 3 for the first group and 3 for the second group, which would give two sets of "semi"-adjusted p-values); a sketch of the all-six-together adjustment is below.
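As a rough sketch of that all-six-together adjustment, assuming simulated stand-ins for the variables and scipy's pearsonr for the correlation p-values:

```python
import numpy as np
from scipy.stats import pearsonr

# Simulated stand-ins for B1-B3 with outcome Y1, and T1-T3 with outcome Y2
rng = np.random.default_rng(1)
B = rng.normal(size=(100, 3)); Y1 = rng.normal(size=100)
T = rng.normal(size=(100, 3)); Y2 = rng.normal(size=100)

pvals = ([pearsonr(B[:, i], Y1)[1] for i in range(3)]
         + [pearsonr(T[:, i], Y2)[1] for i in range(3)])

# One family of six tests: multiply each p-value by 6 (capped at 1)
p_adj = np.minimum(np.array(pvals) * 6, 1.0)
print(p_adj)
```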

Best Answer

You're right. The problem of multiple comparisons exists everywhere, but, because of the way it's typically taught, people only think it pertains to comparing many groups against each other via a whole bunch of $t$-tests. In reality, there are many examples where the problem of multiple comparisons exists, but where it doesn't look like lots of pairwise comparisons; for example, if you have a lot of continuous variables and you wonder if any are correlated, you will have a multiple comparisons problem (see here: Look and you shall find a correlation).

Another example is the one you raise. If you were to run a multiple regression with 20 variables and used $\alpha=.05$ as your threshold, you would expect one of your variables to be 'significant' by chance alone, even if all the nulls were true. The problem of multiple comparisons simply comes from the mathematics of running lots of analyses. If all null hypotheses were true and the tests were independent, the probability of falsely rejecting at least one true null would be $1-(1-\alpha)^p$ (e.g., with $p=5$ and $\alpha=.05$, this is $\approx .23$).
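A quick way to see this is to compute the closed-form rate and check it against a small simulation; the sketch below is illustrative only, assuming statsmodels for the OLS fits and pure-noise data so that all nulls are true by construction:

```python
import numpy as np
import statsmodels.api as sm

alpha, p = 0.05, 5
# Closed form under independence: P(at least one false rejection)
print(1 - (1 - alpha) ** p)  # ~= 0.23

# Simulation: regress pure noise on p independent null predictors
rng = np.random.default_rng(2)
n, reps = 100, 2000
false_hits = 0
for _ in range(reps):
    X = sm.add_constant(rng.normal(size=(n, p)))
    y = rng.normal(size=n)
    res = sm.OLS(y, X).fit()
    false_hits += (res.pvalues[1:] < alpha).any()  # skip the intercept
print(false_hits / reps)  # close to 0.23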

The first strategy to mitigate this is to conduct a simultaneous test of your model. If you are fitting an OLS regression, most software will give you a global $F$-test as a default part of your output. If you are running a generalized linear model, most software will give you an analogous global likelihood ratio test. This test will give you some protection against type I error inflation due to the problem of multiple comparisons (cf., my answer here: Significance of coefficients in linear regression: significant t-test vs non-significant F-statistic). A similar case arises when you have a categorical variable represented by several dummy codes; you wouldn't want to interpret the individual dummy-code $t$-tests, but would instead drop all of the dummy codes together and perform a nested model test.
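As a rough illustration (assuming statsmodels and a made-up four-level factor g), the global F-test comes with the fit, and the nested model test compares the model with and without the whole set of dummy codes:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Made-up data: one continuous predictor and a four-level factor g
rng = np.random.default_rng(3)
df = pd.DataFrame({"x": rng.normal(size=200),
                   "g": rng.choice(list("abcd"), size=200)})
df["y"] = df["x"] + rng.normal(size=200)

full = smf.ols("y ~ x + C(g)", data=df).fit()
print(full.fvalue, full.f_pvalue)  # global (simultaneous) F-test

# Nested model test for the factor as a whole, not its individual dummy t-tests
reduced = smf.ols("y ~ x", data=df).fit()
print(anova_lm(reduced, full))     # extra-sum-of-squares F-test
```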

Another possible strategy is to use an alpha-adjustment procedure, like the Bonferroni correction. You should realize that doing this will reduce your power as well as your familywise type I error rate. Whether this tradeoff is worthwhile is a judgment call for you to make. (FWIW, I don't typically use alpha corrections in multiple regression.)
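If you do go the adjustment route, here is a minimal sketch using statsmodels' multipletests; the p-values are made up for illustration:

```python
from statsmodels.stats.multitest import multipletests

# Made-up p-values from, say, the 5 coefficient t-tests above
pvals = [0.003, 0.020, 0.210, 0.470, 0.850]

reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method="bonferroni")
print(p_adj)    # each p-value multiplied by 5, capped at 1
print(reject)   # which nulls are rejected at familywise alpha = 0.05
```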

Regarding the issue of using $p$-values to do model selection, I think this is a really bad idea. I would not move from a model with 5 variables to one with only 2 because the others were 'non-significant'. When people do this, they bias their model. It may help you to read my answer here: algorithms for automatic model selection to understand this better.

Regarding your update, I would not suggest you assess univariate correlations first so as to decide which variables to use in the final multiple regression model. Doing this will lead to problems with endogeneity unless the variables are perfectly uncorrelated with each other. I discussed this issue in my answer here: Estimating $b_1x_1+b_2x_2$ instead of $b_1x_1+b_2x_2+b_3x_3$.

With regard to the question of how to handle analyses with different dependent variables, whether you'd want to use some sort of adjustment is based on how you see the analyses relative to each other. The traditional idea is to determine whether they are meaningfully considered to be a 'family'. This is discussed here: What might be a clear, practical definition for a "family of hypotheses"? You might also want to read this thread: Methods to predict multiple dependent variables.