Solved – Bootstrap for small sample

Tags: bootstrap, small-sample

Does the bootstrap help with small samples? As I understand it, the bootstrap is a solution when you do not trust a normality assumption. If your data are a genuinely random sample, it seems reasonable to resample from your sample and obtain an empirical distribution for your statistic.

But if your sample is small, even if you believe it is random, can you say the bootstrap is the best approach? Is there a chance that your bootstrap result is biased?

Best Answer

The validity of the bootstrap rests on asymptotic arguments, so there is little basis for saying that the bootstrap is the best method for a small sample.

However, there are some situations where there is good reason to prefer the bootstrap to the usual asymptotic inference for small (or really any) samples. Some uses of the bootstrap achieve what is called an asymptotic refinement. This refers to a situation where the bootstrap estimate of the distribution of your estimator converges faster than the usual asymptotic approximation. For example, if you have some asymptotically normal estimate, $\sqrt{n}(\hat{\theta} - \theta) \leadsto_d N(0,\sigma^2)$, then

$$P( \sqrt{n}(\hat{\theta}-\theta)/\hat{\sigma} < z) = \Phi(z) + O(n^{-1}),$$

i.e. the size distortion of hypothesis tests decreases at rate $1/n$. Some versions of the bootstrap can be shown to have size distortions that decrease at a rate faster than $1/n$. Specifically, using the bootstrap to compute the distribution of a pivotal statistic (one whose asymptotic distribution is completely known, like the t-statistic) generally gives this faster convergence. In this example, if for each bootstrap replicate of the data you calculate $t_b = \sqrt{n}(\hat{\theta}_b - \hat{\theta})/\hat{\sigma}_b$, and let $\hat{F}$ be the CDF of the $t_b$, then

$$P( \sqrt{n}(\hat{\theta}-\theta)/\hat{\sigma} < z) = \hat{F}(z) + o(n^{-1}).$$

Here we have a little $o$ where before there was a big $O$. This gives some hope that when bootstrapping pivotal statistics, the bootstrap might be more accurate for small samples than the usual asymptotic approximations.
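As a rough illustration of bootstrapping a pivotal statistic, here is a minimal R sketch of a bootstrap-t interval for a mean. The function name `boot_t_ci`, the replicate count `B`, and the exponential test data are assumptions made for the example, not part of the answer above.

```r
# Sketch of a bootstrap-t (studentized) interval for a mean.
# boot_t_ci, B and level are illustrative names chosen for this example.
boot_t_ci <- function(x, B = 1999, level = 0.95) {
  n <- length(x)
  theta_hat <- mean(x)
  se_hat <- sd(x) / sqrt(n)

  # Studentized statistic for each bootstrap replicate:
  # t_b = (theta_b - theta_hat) / se_b, which equals
  # sqrt(n) * (theta_b - theta_hat) / sigma_b.
  t_b <- replicate(B, {
    xb <- sample(x, n, replace = TRUE)
    (mean(xb) - theta_hat) / (sd(xb) / sqrt(n))
  })
  # A resample made of one repeated value has sd 0; drop those replicates.
  t_b <- t_b[is.finite(t_b)]

  alpha <- 1 - level
  q <- quantile(t_b, c(1 - alpha / 2, alpha / 2), names = FALSE)
  # Quantiles of the bootstrap CDF replace the normal or t critical values.
  c(lower = theta_hat - q[1] * se_hat, upper = theta_hat - q[2] * se_hat)
}

set.seed(1)
x <- rexp(20)   # skewed data, where a refinement is most likely to matter
ci <- boot_t_ci(x)
print(ci)
```

Note that the upper quantile of the $t_b$ gives the lower endpoint and vice versa, which is what lets the interval pick up skewness in the data.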

Related Solutions

Solved – Determining sample size necessary for bootstrap method / Proposed Method

I took interest in this question because I saw the word bootstrap and I have written books on the bootstrap. Also people often ask "How many bootstrap samples do I need to get a good Monte Carlo approximation to the bootstrap result?" My suggested answer to that question is to keep increasing the size until you get convergence. No one number fits all problems.
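One way to act on that advice is to recompute the bootstrap estimate for growing numbers of replicates and watch it settle. The sketch below does this for the bootstrap standard error of a mean; the schedule of `B` values, the helper name `boot_se`, and the simulated data are arbitrary choices for illustration.

```r
# Sketch: grow the number of bootstrap replicates B and watch the estimate settle.
# boot_se and the B schedule are illustrative choices, not a recommendation.
set.seed(42)
x <- rnorm(30)

# Bootstrap standard error of the mean from B resamples.
boot_se <- function(x, B) {
  sd(replicate(B, mean(sample(x, replace = TRUE))))
}

B_values <- c(250, 500, 1000, 2000, 4000, 8000)
se_hat <- sapply(B_values, function(B) boot_se(x, B))

# Relative change between successive estimates; small values suggest convergence.
rel_change <- abs(diff(se_hat)) / se_hat[-1]
print(data.frame(B = B_values, se = se_hat, rel_change = c(NA, rel_change)))
```

Each row uses an independent set of resamples, so the relative change is itself noisy. How small it needs to be depends on the problem, which is the point of "no one number fits all".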

But that is apparently not the question you are asking. You seem to be asking what the original sample size needs to be for the bootstrap to work. First of all, I do not agree with your premise. The basic nonparametric bootstrap assumes that the sample is taken at random from a population. So for any sample size $n$ the distribution for samples chosen at random is the sampling distribution assumed in bootstrapping. The bootstrap principle says that choosing a random sample of size $n$ from the population can be mimicked by choosing a bootstrap sample of size $n$ from the original sample. Whether or not the bootstrap principle holds does not depend on any individual sample "looking representative of the population". What it does depend on is what you are estimating and some properties of the population distribution (e.g., it works for sample means when the population distribution has finite variance, but not when it has infinite variance). It will not work for estimating extremes regardless of the population distribution.
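As an illustration of why extremes are a problem, take the sample maximum $X_{(n)}$ and assume a continuous population, so that there are no ties. A bootstrap sample reproduces the observed maximum exactly whenever it includes that observation at least once, so

$$P^*\left(X^*_{(n)} = X_{(n)}\right) = 1 - \left(1 - \frac{1}{n}\right)^n \to 1 - e^{-1} \approx 0.632.$$

The bootstrap distribution of the maximum therefore keeps a point mass of roughly 63% at the observed maximum however large $n$ becomes, while the true sampling distribution of the maximum of a continuous population has no such atom.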

The theory of the bootstrap involves showing consistency of the estimate. So it can be shown in theory that it works for large samples. But it can also work in small samples. I have seen it work for classification error rate estimation particularly well in small sample sizes such as 20 for bivariate data.

Now if the sample size is very small---say 4---the bootstrap may not work simply because the set of possible bootstrap samples is not rich enough. This issue of too small a sample size is discussed in my book and in Peter Hall's book. But the number of distinct bootstrap samples gets large very quickly, so this is not an issue even for sample sizes as small as 8. You can take a look at these references:

My book: Bootstrap Methods: A Guide for Practitioners and Researchers
Hall's book: The Bootstrap and Edgeworth Expansion
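The number of distinct bootstrap samples discussed above can be computed directly. Assuming the $n$ observations are all distinct and the order within a resample is ignored, the count is the number of multisets of size $n$ drawn from $n$ items, $\binom{2n-1}{n}$. A short sketch (the vector of sample sizes is an arbitrary choice):

```r
# Number of distinct bootstrap samples (order ignored, all observations distinct):
# multisets of size n from n items, choose(2n - 1, n).
n <- c(2, 4, 6, 8, 10, 20)
distinct <- choose(2 * n - 1, n)
print(data.frame(n = n, distinct_resamples = distinct))
```

This gives 35 distinct resamples for $n = 4$ but 6435 for $n = 8$, which is the rapid growth referred to above.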

Solved – Can bootstrap be seen as a “cure” for the small sample size

I remember reading that using the percentile confidence interval for bootstrapping is equivalent to using a Z interval instead of a T interval and using $n$ instead of $n-1$ for the denominator. Unfortunately I don't remember where I read this and could not find a reference in my quick searches. These differences don't matter much when $n$ is large (and the advantages of the bootstrap outweigh these minor problems when $n$ is large), but with small $n$ this can cause problems. Here is some R code to simulate and compare:

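# Simulate one sample of size n from N(0, 1) and test H0: mean = 0 four ways.
simfun <- function(n=5) {
    x <- rnorm(n)
    m.x <- mean(x)
    s.x <- sd(x)
    z <- m.x/(1/sqrt(n))        # z statistic using the true sd (1)
    t <- m.x/(s.x/sqrt(n))      # t statistic using the sample sd
    b <- replicate(10000, mean(sample(x, replace=TRUE)))  # bootstrap means
    # Each entry is TRUE when the corresponding test rejects at the 5% level.
    c( t=abs(t) > qt(0.975,n-1), z=abs(z) > qnorm(0.975),
        z2 = abs(t) > qnorm(0.975),
        b= (0 < quantile(b, 0.025)) | (0 > quantile(b, 0.975))
     )
}

# Rejection rates over 10000 simulated samples (type I error, since H0 is true).
out <- replicate(10000, simfun())
rowMeans(out)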

My results for one run are:

     t      z     z2 b.2.5% 
0.0486 0.0493 0.1199 0.1631

So we can see that the t-test and the z-test (with the true population standard deviation) both give a type I error rate that is essentially $\alpha$, as designed. The improper z-test (dividing by the sample standard deviation, but using the Z critical value instead of the T critical value) rejects the null more than twice as often as it should. The bootstrap rejects the null more than 3 times as often as it should (checking whether 0, the true mean, lies in the percentile interval). So for this small sample size the simple bootstrap is not sized properly and therefore does not fix the problem, even though the data here are exactly normal, the most favorable case. The improved bootstrap intervals (BCa etc.) will probably do better, but this should raise some concern about using bootstrapping as a panacea for small sample sizes.
