Solved – When should I use FDR or Bonferroni in multiple comparisons?

anova, bonferroni, many-categories, multiple-comparisons

I have the following situations:

(1) A very large number of variables – say, 100 variables and 200 samples (observations) – although each variable has only 2, 3, or at most 4 levels. I am running an ANOVA on each variable to test the hypothesis.

Say, for the first variable, Ho: l1 = l2 = l3 = l4, where l1, l2, l3, and l4 are the means at levels 1 to 4. The hypothesis is tested in a similar way for each of the 100 variables independently (not as a multiple regression, but taking one variable at a time).
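For concreteness, the per-variable omnibus test can be sketched in plain Python (the data, level labels, and function name below are made up for illustration; in practice each of the 100 variables would be run through the same function):

```python
def anova_f(groups):
    """One-way ANOVA F statistic for a dict {level: [observations]}."""
    all_obs = [x for obs in groups.values() for x in obs]
    n, k = len(all_obs), len(groups)
    grand_mean = sum(all_obs) / n
    # Between-group sum of squares (df = k - 1)
    ssb = sum(len(obs) * (sum(obs) / len(obs) - grand_mean) ** 2
              for obs in groups.values())
    # Within-group sum of squares (df = n - k)
    ssw = sum(sum((x - sum(obs) / len(obs)) ** 2 for x in obs)
              for obs in groups.values())
    return (ssb / (k - 1)) / (ssw / (n - k))

# Hypothetical variable with 3 levels (l1, l2, l3 in the notation above)
groups = {1: [1, 2, 3], 2: [2, 3, 4], 3: [5, 6, 7]}
print(anova_f(groups))  # → 13.0
```

The resulting F statistic would be compared against the F(k−1, n−k) distribution to obtain the p-value for that one variable's null hypothesis.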

(2) A single variable with a large number of levels (let's say 100 levels).

Now there is a single null hypothesis:

Ho: l1 = l2 = l3 = l4 = ... = l100 

In both situations, do I need some sort of multiple-comparison correction, such as the Bonferroni correction or the false discovery rate (FDR) correction?

Why?

Best Answer

First of all, both situations are the same from a statistical and computational point of view. To perform the analysis, factors are encoded as a set of dummy variables (at least in R), so a factor with 100 levels corresponds to 99 binary variables.
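The dummy encoding described above can be illustrated directly (a minimal sketch with hypothetical data; in R, `model.matrix` does this automatically, treating the first level as the reference):

```python
def dummy_encode(values):
    """Encode a factor as k-1 indicator columns (first level is the reference)."""
    levels = sorted(set(values))
    others = levels[1:]  # every level except the reference gets a column
    return [[1 if v == lvl else 0 for lvl in others] for v in values]

obs = ["a", "b", "c", "b"]
print(dummy_encode(obs))  # → [[0, 0], [1, 0], [0, 1], [1, 0]]
```

With 100 levels, each observation is encoded as a row of 99 zeros and ones, which is why the 100-level factor in situation (2) corresponds to 99 binary variables.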

The ANOVA F-test is an omnibus test. Applying a multi-factor ANOVA to the first situation, you are testing whether the means in all groups created by all variables are equal. Only if that result is significant and you want to go deeper – to find out which particular groups differ – do you need to worry about the multiple-comparison problem. However, this problem has already been worked out in many ways. Instead of running multiple t-tests and correcting $\alpha$ with the FDR or Bonferroni method, it is better to use one of the dedicated post-hoc tests (which are designed to take multiple comparisons into account), such as Tukey's HSD. The Wikipedia article on post-hoc analysis discusses the alternatives.

To summarize: when performing a single multi-factor ANOVA, no multiple-comparison correction is required, and when using a proper post-hoc test, no additional correction is required either.
