Bootstrap – Proving Correctness of Bootstrap Confidence Intervals


I am using a bootstrap technique to compute confidence intervals for a parameter of interest.

Let $\textbf{Z}_1, \dots, \textbf{Z}_n\in\mathbb{R}^d$ be an iid random sample. From this sample we compute an estimate $\hat{\theta}$ of a parameter $\theta$ (too ugly to express explicitly). I would like to obtain a confidence interval (CI) for this parameter.

I use the basic bootstrap technique: sample with replacement from the original sample to generate multiple bootstrap samples, each of the same size as the original dataset. For each bootstrap sample, I calculate the estimate $\hat{\theta}^\star$ of the statistic. I then take quantiles of the resampled statistics to derive a $95\%$ confidence interval. I want to prove the asymptotic correctness of these intervals. Is there some general result that I can use? I have only found a theoretical justification, via the Berry–Esseen theorem, for the case $\theta=$ mean, but my statistic $\theta$ is much more complicated.
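
For concreteness, here is a minimal sketch of this resampling scheme in Python/NumPy. The names `percentile_bootstrap_ci` and `theta_hat` are hypothetical, and the estimator is a placeholder for whatever statistic is actually of interest:

```python
import numpy as np

rng = np.random.default_rng(0)

def percentile_bootstrap_ci(Z, theta_hat, B=2000, alpha=0.05):
    """Two-sided percentile bootstrap CI for theta_hat(Z), Z of shape (n, d)."""
    n = Z.shape[0]
    stats = np.empty(B)
    for b in range(B):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        stats[b] = theta_hat(Z[idx])
    # empirical quantiles of the resampled statistics
    return np.quantile(stats, [alpha / 2, 1 - alpha / 2])

# example: theta_hat = mean of the first coordinate (stand-in for an ugly statistic)
Z = rng.normal(size=(200, 3))
lo, hi = percentile_bootstrap_ci(Z, lambda z: z[:, 0].mean())
print(lo, hi)
```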

In summary, I want to prove something like this:

Theorem
Let $\hat{\theta}$ be an estimator (from a sample of size $n$) of $\theta$, and assume $\mathbb{E}\|\textbf{Z}\|^2<\infty$.

Denote the resamples from the original sample as $(\textbf{Z}^\star_{1,1}, \dots, \textbf{Z}^\star_{1,n}), \dots, (\textbf{Z}^\star_{B,1}, \dots, \textbf{Z}^\star_{B,n})$, with corresponding estimates $\hat{\theta}_1^\star, \dots, \hat{\theta}_B^\star$ for $B\in\mathbb{N}$.

Let $U:=\hat{\theta}_{(\alpha)}^\star$ denote the $\lceil B(1-\alpha)\rceil$-th largest value among $\hat{\theta}_1^\star, \dots, \hat{\theta}_B^\star$, i.e., the empirical $\alpha$-quantile of the resampled statistics.

Then,
$$\lim_{n\to\infty}\lim_{B\to\infty}P(\theta<U)=\alpha.$$

Best Answer

How you prove this depends on your statistic $\theta$: there are straightforward and complicated versions, depending on how difficult $\theta$ is. Fundamentally, though, the bootstrap works by the delta method; there's no need for the Berry–Esseen theorem.

A simplifying fact is that $\hat\theta$ is going to have an asymptotically Normal distribution, so getting the tail probabilities asymptotically correct reduces to getting the asymptotic mean and variance parameters correct.
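
Heuristically: if $\sqrt{n}(\hat\theta-\theta)\stackrel{d}{\to}N(0,\sigma^2)$ and, conditionally on the data, $\sqrt{n}(\hat\theta^\star-\hat\theta)$ has the same Normal limit, then the bootstrap $\alpha$-quantile satisfies $U\approx\hat\theta+\sigma z_\alpha/\sqrt{n}$ (with $z_\alpha$ the standard Normal $\alpha$-quantile), so
$$P(\theta<U)\approx P\left(\frac{\sqrt{n}(\hat\theta-\theta)}{\sigma}>-z_\alpha\right)=\Phi(z_\alpha)=\alpha,$$
which is exactly the limit in the theorem above.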

Often, your statistic $\hat\theta$ will be a differentiable function of a mean. In that case, you prove that the bootstrap is correct for the mean, and the correctness transfers to the statistic automatically. Or $\hat\theta$ solves $$\frac{1}{n}\sum_{i=1}^n U_i(\theta)=0,$$ where $U_i$ satisfies regularity conditions that make $\hat\theta$ asymptotically Normal. Again, you show the bootstrap is correct for the mean, and the same arguments that make $\hat\theta$ asymptotically Normal also show the bootstrap is correct.
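
As a quick numerical sanity check of the smooth-function-of-a-mean case, one can simulate the coverage of the percentile interval for a hypothetical example, say $\theta=\exp(\mathbb{E}Z)$ with standard Normal data:

```python
import numpy as np

rng = np.random.default_rng(1)
n, B, alpha, reps = 200, 1000, 0.05, 500
theta = np.exp(0.0)  # true parameter: theta = exp(E[Z]) = 1 for Z ~ N(0, 1)

covered = 0
for _ in range(reps):
    Z = rng.normal(size=n)
    # theta_hat = g(sample mean) with g = exp, a smooth function of a mean
    boot = np.exp(rng.choice(Z, size=(B, n), replace=True).mean(axis=1))
    lo, hi = np.quantile(boot, [alpha / 2, 1 - alpha / 2])
    covered += lo < theta < hi

print(f"empirical coverage: {covered / reps:.3f}")  # should be close to 0.95
```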

There's a more general argument that requires more background maths. If your data are from an iid sequence in $\mathbb{R}^d$, then the empirical CDF is asymptotically Normal: $$\sqrt{n}(\mathbb{F}_n-F)\stackrel{w}{\to} Z,$$ where $Z$ is a Gaussian process indexed by $\mathbb{R}^d$. The (functional) delta method then says that any suitably (Hadamard) differentiable functional $\theta(\mathbb{F}_n)$ is asymptotically Normal. Since the bootstrap is correct for $\mathbb{F}_n$, in the sense that $$\sqrt{n}(\mathbb{F}^*_n-\mathbb{F}_n)\stackrel{w}{\to} Z$$ for the same limiting $Z$ (for almost all data sequences), the delta method also says the bootstrap is correct.
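
To make "the bootstrap is correct for $\mathbb{F}_n$" tangible, here is a small simulation sketch (assuming univariate standard Normal data) comparing the Kolmogorov statistic $\sup_x\sqrt{n}\,|\mathbb{F}_n(x)-F(x)|$ with its bootstrap analogue $\sup_x\sqrt{n}\,|\mathbb{F}^*_n(x)-\mathbb{F}_n(x)|$:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, reps = 500, 2000

# sampling distribution of sup_x sqrt(n)|F_n(x) - F(x)| over fresh N(0,1) samples
ks_true = np.array([
    np.sqrt(n) * stats.kstest(rng.normal(size=n), "norm").statistic
    for _ in range(reps)
])

# bootstrap analogue sup_x sqrt(n)|F*_n(x) - F_n(x)|, from one fixed sample Z
Z = np.sort(rng.normal(size=n))
Fn = np.arange(1, n + 1) / n  # ECDF of Z at its own (sorted) jump points

def ks_boot():
    Zstar = np.sort(rng.choice(Z, size=n, replace=True))
    # both ECDFs are step functions jumping only at values of Z, so the sup
    # is attained at those jump points
    Fstar = np.searchsorted(Zstar, Z, side="right") / n
    return np.sqrt(n) * np.max(np.abs(Fstar - Fn))

ks_star = np.array([ks_boot() for _ in range(reps)])
# the two 95% quantiles should be close (both approach the Kolmogorov limit)
print(np.quantile(ks_true, 0.95), np.quantile(ks_star, 0.95))
```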

A good source with the details is Chapter 23 of Asymptotic Statistics by van der Vaart. He starts with the mean, then the delta method in finite-dimensional cases, and then the infinite-dimensional case. Finally, he discusses higher-order accuracy, via Edgeworth expansions, for sufficiently well-behaved statistics.