Solved – Why not always use bootstrap CIs

Tags: bootstrap, confidence-interval, normality-assumption, resampling

I was wondering how bootstrap CIs (and BCa in particular) perform on normally-distributed data. There seems to be a lot of work examining their performance on various types of distributions, but I could not find anything on normally-distributed data. Since it seems an obvious thing to study first, I suppose the relevant papers are simply too old.

I did some Monte Carlo simulations using the R boot package and found the bootstrap CIs to be in good agreement with the exact CIs, although for small samples (N < 20) they tend to be a bit liberal (i.e. narrower intervals). For large enough samples, they are essentially the same.
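For concreteness, here is a minimal sketch of the kind of simulation described above (the sample size, number of replicates, and helper names are illustrative choices, not my exact code): it compares the coverage of BCa bootstrap CIs for the mean with exact t-based CIs on normal samples.

```r
## Sketch only: compare BCa bootstrap CIs with exact t-based CIs on normal data.
library(boot)

set.seed(1)
n_sims <- 1000          # number of Monte Carlo replicates (illustrative)
n      <- 15            # small sample size (N < 20)
mu     <- 0; sigma <- 1

mean_stat <- function(data, idx) mean(data[idx])  # statistic for boot()

cover_bca <- cover_t <- logical(n_sims)
for (i in seq_len(n_sims)) {
  x  <- rnorm(n, mu, sigma)
  b  <- boot(x, mean_stat, R = 999)
  ci <- boot.ci(b, type = "bca")$bca[4:5]  # lower and upper BCa limits
  tt <- t.test(x)$conf.int                 # exact CI under normality
  cover_bca[i] <- ci[1] <= mu && mu <= ci[2]
  cover_t[i]   <- tt[1] <= mu && mu <= tt[2]
}
mean(cover_bca)  # tends to fall slightly below the nominal 95% for small n
mean(cover_t)    # close to the nominal 95%
```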

This makes me wonder whether there is any good reason not to always use bootstrapping. Given the difficulty of assessing whether a distribution is normal, and the many pitfalls involved, it seems reasonable to skip that decision altogether and report bootstrap CIs irrespective of the distribution. I understand the motivation for not using non-parametric tests systematically, since they have less power, but my simulations tell me this is not the case for bootstrap CIs; they are even smaller.

A similar question that bugs me is why not always use the median as the measure of central tendency. People often recommend using it to characterize non-normally-distributed data, but since the median coincides with the mean for normally-distributed data, why make the distinction? It would seem quite beneficial if we could get rid of the procedures for deciding whether a distribution is normal or not.

I am very curious about your thoughts on these issues, and whether they have been discussed before. References would be highly appreciated.

Thanks!

Pierre

Best Answer

It is helpful to look at the motivation for the BCa interval and its mechanisms (i.e. the so-called "correction factors"). The BCa interval is one of the most important parts of the bootstrap toolbox because it is a more general form of the bootstrap percentile interval (i.e. the confidence interval based solely on the bootstrap distribution itself).

In particular, look at the relationship between the BCa interval and the bootstrap percentile interval: when the bias-correction factor (the first "correction factor") and the acceleration factor (the second "correction factor", which adjusts for skewness) are both zero, the BCa interval reduces to the plain bootstrap percentile interval.
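For reference, Efron's BCa construction makes this explicit: the nominal percentile levels are mapped to adjusted levels

$$\alpha_1 = \Phi\!\left(\hat z_0 + \frac{\hat z_0 + z_{\alpha}}{1 - \hat a\,(\hat z_0 + z_{\alpha})}\right),
\qquad
\alpha_2 = \Phi\!\left(\hat z_0 + \frac{\hat z_0 + z_{1-\alpha}}{1 - \hat a\,(\hat z_0 + z_{1-\alpha})}\right),$$

where $\hat z_0$ is the bias-correction factor, $\hat a$ is the acceleration, $\Phi$ is the standard normal CDF, and $z_\alpha$ is its $\alpha$ quantile. Setting $\hat z_0 = \hat a = 0$ gives $\alpha_1 = \alpha$ and $\alpha_2 = 1 - \alpha$, which is exactly the plain percentile interval.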

I do not think it would be a good idea to ALWAYS use bootstrapping. Bootstrapping is a flexible technique with a variety of mechanisms for handling different problems (e.g. non-normality): there are several kinds of bootstrap confidence intervals, and there are variants of the bootstrap for specific settings, such as the wild bootstrap for heteroscedastic data (sketched below). But it relies on one crucial assumption: that the sample accurately represents the true population.
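As an aside, here is a minimal sketch of what a wild bootstrap looks like in practice (the toy data and Rademacher weights are my own illustrative choices): residuals from a regression fit are multiplied by random signs while the design is held fixed, which preserves the heteroscedasticity pattern.

```r
## Sketch only: wild bootstrap percentile CI for a regression slope
## under heteroscedastic errors (toy data, Rademacher multipliers).
set.seed(2)
n <- 100
x <- runif(n)
y <- 1 + 2 * x + rnorm(n, sd = 0.5 + 2 * x)   # error variance grows with x

fit         <- lm(y ~ x)
res         <- resid(fit)
fitted_vals <- fitted(fit)

R <- 2000
slopes <- numeric(R)
for (b in seq_len(R)) {
  w  <- sample(c(-1, 1), n, replace = TRUE)   # random signs
  yb <- fitted_vals + res * w                 # perturb residuals, keep x fixed
  slopes[b] <- coef(lm(yb ~ x))[2]
}
quantile(slopes, c(0.025, 0.975))             # percentile CI for the slope
```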

This assumption, although simple in nature, can be difficult to verify, especially with small sample sizes (though a small sample can still be an accurate reflection of the true population!). If the original sample on which the bootstrap distribution (and hence everything that follows from it) is based does not adequately represent the population, then your results, and any decision based on them, will be flawed.

CONCLUSION: There is a lot of ambiguity with the bootstrap and you should exercise caution before applying it.
