Bootstrap – Understanding Subsample Bootstrapping Methods

bootstrap

I have been working on the uncertainty associated with a quantity calculated from a Monte Carlo project. Normally I would use the bootstrap, resampling with replacement, but for a couple of technical reasons that is not particularly easy here. It was suggested that I simply split my MC data set into subsets, repeat the calculation on each subset, and estimate the uncertainty from the spread of the results. I have also come across references in the past to bootstrapping with only a subset of the original dataset.

Can someone point me to a tutorial on this, or briefly explain how it differs from bootstrapping with replacement and simply setting the number of samples to a fraction of the total size? I would be particularly interested in a method that allows $n$ to be different for each subsample, as this would make my analysis much simpler.

Best Answer

There are two methods related to your question: the m out of n bootstrap and random subsampling.

In his original proposal, Efron chose the bootstrap sample size to be the same as the original sample size n. There was no specific requirement to do that, but the idea was to mimic random sampling from the population as closely as possible. However, there are situations where this ordinary bootstrap is inconsistent. Bickel and Ren, among others, showed that taking a smaller bootstrap sample size m < n can lead to consistent results. This works asymptotically, with m and n both tending to infinity but at rates such that m/n goes to 0.

Random subsampling was introduced by Hartigan and McCarthy in the late 1960s, about a decade before the bootstrap. It works by randomly sampling subsets of the original sample. You may be able to take either of these approaches with your data.
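To make the m out of n idea concrete, here is a minimal Python sketch, not a definitive implementation. The function name `m_out_of_n_bootstrap_se`, its parameters, and the simulated normal data are all hypothetical choices for illustration. It also assumes the statistic converges at the usual $\sqrt{n}$ rate, so that the spread of the size-m replicates can be rescaled by $\sqrt{m/n}$ to approximate the standard error at the full sample size.

```python
import numpy as np

def m_out_of_n_bootstrap_se(data, statistic, m, n_boot=2000, seed=0):
    """Rough standard error of statistic(data) from an m out of n bootstrap.

    Assumes a sqrt(n) convergence rate for the statistic (an assumption,
    not something the method guarantees for every estimator).
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = data.shape[0]
    reps = np.empty(n_boot)
    for i in range(n_boot):
        # Draw m indices WITH replacement from the n original points.
        idx = rng.integers(0, n, size=m)
        reps[i] = statistic(data[idx])
    # Replicates of size m are more variable than the full-sample estimate;
    # rescale their spread by sqrt(m / n) under the sqrt(n)-rate assumption.
    return np.sqrt(m / n) * reps.std(ddof=1)

# Illustrative data only: 10,000 standard normal draws standing in for MC output.
rng = np.random.default_rng(1)
sample = rng.normal(loc=0.0, scale=1.0, size=10_000)
se = m_out_of_n_bootstrap_se(sample, np.mean, m=500)
print(f"estimated standard error of the mean: {se:.4f}")
```

For the sample mean of unit-variance data with n = 10,000, the result should be close to the textbook value of 1/sqrt(n) = 0.01, which gives a quick sanity check before applying the same recipe to a less tractable statistic.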

For information on the m out of n bootstrap you can consult either of the following books that I authored/co-authored:

An Introduction to Bootstrap Methods with Applications to R

Bootstrap Methods: A Guide for Practitioners and Researchers

This book by Politis, Romano and Wolf goes into random subsampling in great detail:

Subsampling
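As a rough illustration of random subsampling (subsets drawn without replacement), the Python sketch below lets the subset size differ from one subsample to the next, which is what the question asks about. It is a sketch under stated assumptions rather than the procedure from any of the books above: the function name `subsampling_se`, the size range, and the simulated data are hypothetical, the statistic is assumed to converge at a $\sqrt{n}$ rate, and each subset size b is assumed small relative to n so that no finite-population correction is applied.

```python
import numpy as np

def subsampling_se(data, statistic, sizes, seed=0):
    """Rough standard error of statistic(data) from random subsamples.

    `sizes` holds one subset size b per subsample; sizes may differ.
    Assumes a sqrt(n) convergence rate and b much smaller than n.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = data.shape[0]
    theta_hat = statistic(data)  # estimate from the full sample
    roots = np.empty(len(sizes))
    for i, b in enumerate(sizes):
        # Draw b distinct indices, i.e. sample WITHOUT replacement.
        idx = rng.choice(n, size=int(b), replace=False)
        # Scaling by sqrt(b) puts subsamples of different sizes on a
        # common footing under the sqrt(n)-rate assumption.
        roots[i] = np.sqrt(b) * (statistic(data[idx]) - theta_hat)
    # Convert the spread of the scaled deviations back to sample size n.
    return roots.std(ddof=1) / np.sqrt(n)

# Illustrative data only: 10,000 standard normal draws standing in for MC output.
rng = np.random.default_rng(2)
sample = rng.normal(size=10_000)
sizes = rng.integers(200, 801, size=2000)  # a different size per subsample
se_sub = subsampling_se(sample, np.mean, sizes)
print(f"estimated standard error of the mean: {se_sub:.4f}")
```

Because each deviation is multiplied by the square root of its own subset size, subsamples of unequal size can be pooled. For the mean of unit-variance data the result should land near 0.01, slightly on the low side because the finite-population effect is ignored.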