Solved – Which confidence interval adjustment should I use with FDR-adjusted p-values?

confidence-interval, false-discovery-rate, multiple-comparisons

I need to do multiple comparisons, and I want to adjust the p-values using the false discovery rate (FDR). However, it is not possible to adjust the confidence intervals by FDR as well.

What should I do? When asked for adjust = "fdr", lsmeans applies a Bonferroni correction to the confidence intervals. Is it consistent (reasonable) for a researcher to report p-values adjusted by FDR alongside confidence intervals adjusted by Bonferroni (or even unadjusted intervals)?

m1 <- lm(Sepal.Length ~ Species, iris)
l1 <- lsmeans::lsmeans(m1, "Species")
summary(pairs(l1), infer = c(TRUE,TRUE), adjust = "fdr")
 contrast               estimate      SE  df lower.CL upper.CL t.ratio p.value
 setosa - versicolor      -0.930 0.10296 147 -1.17933 -0.68067  -9.033  <.0001
 setosa - virginica       -1.582 0.10296 147 -1.83133 -1.33267 -15.366  <.0001
 versicolor - virginica   -0.652 0.10296 147 -0.90133 -0.40267  -6.333  <.0001

Confidence level used: 0.95 
Conf-level adjustment: bonferroni method for 3 estimates 
P value adjustment: fdr method for 3 tests 

In this case, "fdr" and "bonferroni" give the same results; with more than three categories, however, the results start to differ.
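To illustrate how the two adjustments drift apart as the number of comparisons grows, here is a minimal sketch using base R's `p.adjust`. The six p-values are made up for illustration (six tests is what all pairwise comparisons among four groups would give); they do not come from the iris model above.

```r
# Illustrative p-values only (not from any real analysis): six tests,
# as in all pairwise comparisons among four groups.
p <- c(0.001, 0.008, 0.02, 0.03, 0.04, 0.2)

bonf <- p.adjust(p, method = "bonferroni")
bh   <- p.adjust(p, method = "fdr")   # "fdr" is an alias for "BH"

# Side-by-side view of raw and adjusted p-values
print(data.frame(raw = p, bonferroni = bonf, fdr = bh))

# Number of rejections at the 0.05 level under each adjustment
n_reject <- c(bonferroni = sum(bonf < 0.05), fdr = sum(bh < 0.05))
print(n_reject)
```

The FDR-adjusted p-values are never larger than the Bonferroni ones, so the two sets of conclusions can disagree once there are enough tests.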

Best Answer

FDR testing is implicitly step-down testing — testing certain things only if other tests pass. However, confidence intervals aren’t typically constructed conditionally, so there is no parallel. In general, simultaneous testing procedures gain power by not testing as many things. But simultaneous CIs essentially test everything, hence they enforce the strongest control over the error rate.

I suppose you could proceed to construct CIs conditionally based on the Student-Newman-Keuls (SNK) method, which is known to control the FDR. First, construct a CI for the difference between the largest and smallest mean, using the Tukey adjustment. Proceed to construct CIs for sub-ranges ONLY if the outer range’s CI excludes zero. For each interval, use the Tukey adjustment for the number of means spanned in its sub-range.

This procedure will produce a tree of dependent CIs, but typically it will exclude CIs for some of the pairwise differences. In the extreme case, you end up with only one CI, if the first one contains zero.

You’d have to implement this by hand.
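As a rough idea of what "by hand" could look like, here is a sketch in base R. It assumes a balanced one-way layout with a pooled residual standard deviation, and it builds a CI for a sub-range only when every wider range containing it produced a CI that excludes zero. The helper name `snk_intervals` and its arguments are hypothetical, not part of lsmeans; treat this as an illustration of the idea rather than a validated procedure.

```r
# Sketch only: SNK-style conditional CIs for a balanced one-way layout.
# snk_intervals() is a hypothetical helper, not part of lsmeans or base R.
snk_intervals <- function(means, n, s, df, level = 0.95) {
  means <- sort(means)                  # named numeric vector of group means
  k <- length(means)
  stopifnot(k >= 2)
  # excl[i, j] is TRUE once the CI for the range i..j was built and excludes zero
  excl <- matrix(FALSE, k, k)
  rows <- list()
  for (r in k:2) {                      # widest range first, then sub-ranges
    for (i in seq_len(k - r + 1)) {
      j <- i + r - 1
      # The two ranges one step wider that contain i..j must both exclude zero
      left_ok  <- i == 1 || excl[i - 1, j]
      right_ok <- j == k || excl[i, j + 1]
      if (!(left_ok && right_ok)) next
      est  <- means[[j]] - means[[i]]
      # Tukey (studentized range) half-width for r means spanned
      half <- qtukey(level, nmeans = r, df = df) * s / sqrt(n)
      excl[i, j] <- est - half > 0
      rows[[length(rows) + 1]] <- data.frame(
        contrast = paste(names(means)[j], "-", names(means)[i]),
        span = r, estimate = est,
        lower.CL = est - half, upper.CL = est + half
      )
    }
  }
  do.call(rbind, rows)
}

# Applied to the iris model from the question (50 observations per species)
m1 <- lm(Sepal.Length ~ Species, data = iris)
grp_means <- vapply(split(iris$Sepal.Length, iris$Species), mean, numeric(1))
res <- snk_intervals(grp_means, n = 50, s = sigma(m1), df = df.residual(m1))
print(res)
```

For iris, the outer interval excludes zero, so all three intervals are produced. Note that intervals spanning only two means use the Tukey quantile for two means, which is the same as an unadjusted t interval.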
