Solved – Find the approximate standard error of the bootstrap distribution

bootstrap, error, self-study

I am studying for an exam; this is not a homework question.

According to a survey of 500 randomly selected US high school students, 62.9% plan to attend a four-year university. A bootstrap distribution is created to find a 95% confidence interval for the true proportion of US high school students who plan to attend a four-year university.

What would be the approximate standard error of the bootstrap distribution?
A. 0.0216
B. 0.0223
C. 0.0005
D. 0.0135

The actual answer is A, but I'm not sure how to get there.

What I have done:

  1. Find the standard deviation of the mean. At 95% confidence the standard deviation should be 1.96
  2. SE = 1.96 / √n; the n = 62.9% of 500 which is 314.
  3. I get 0.1106.

Not sure what I am doing wrong.

Best Answer

Let $\bar x = \frac1n \sum_{i=1}^n x_i$ where $x_i$ is the indicator variable of whether student $i$ plans to attend a four-year university. Your data are $\bar x = 0.629$ and $n=500$.

Let $\mathbb P_n = \frac1n \sum_{i=1}^n \delta_{x_i}$ be the empirical distribution of your observations. This simplifies to $$ \mathbb P_n = \bar x \delta_1 + (1-\bar x) \delta_0 $$ which is a Bernoulli distribution with parameter $\bar x$.

Let $x^*_i \sim \mathbb P_n$, $i=1,\dots,n$, be iid draws from this empirical distribution, so that $(x^*_1,\dots,x^*_n)$ is the bootstrap sample. Your bootstrap estimate of the parameter of interest is $\frac1n \sum_{i=1}^n x^*_i$, which has mean $\bar x$ under the empirical distribution. The variance of this estimate (which will be your bootstrap estimate of the variance of the original estimator) is $$ \mathbb P_n \Big(\frac1n \sum_{i=1}^n x^*_i - \bar x\Big)^2 = \frac1n \mathbb P_n (x^*_1 -\bar x)^2 = \frac{\bar x(1-\bar x)}{n} $$ where $\mathbb P_n$ applied to a function means the expectation under the empirical measure $\mathbb P_n$. (If this is too much notation, just mentally replace it with $\mathbb E$.) The first equality holds because the $x^*_i$ are iid, so the variance of their average is $1/n$ times the variance of a single draw. The second equality is the standard formula for the variance of a Bernoulli variable.

Your bootstrap estimate of the standard error is then $$ \sqrt{\frac{\bar x(1-\bar x)}{n}} = \sqrt{\frac{0.629(1-0.629)}{500} } \approx 0.0216 $$
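If you want to convince yourself numerically, here is a minimal sketch in Python (standard library only) that compares the closed-form value with a simulated bootstrap. The function names, the number of resamples and the seed are my own choices for illustration, not part of the exam question. Since 62.9% of 500 is 314.5, which is not a whole number of students, the sketch assumes we resample directly from a Bernoulli distribution with parameter 0.629 instead of rebuilding the raw data.

```python
import math
import random
import statistics


def exact_bootstrap_se(p_hat: float, n: int) -> float:
    """Closed-form standard deviation of the bootstrap sample proportion."""
    return math.sqrt(p_hat * (1.0 - p_hat) / n)


def simulated_bootstrap_se(p_hat: float, n: int,
                           n_resamples: int = 2000, seed: int = 0) -> float:
    """Monte Carlo estimate: SD of the proportions from many bootstrap samples.

    Assumption: each bootstrap observation is a Bernoulli(p_hat) draw, which
    is what resampling 0/1 data with sample mean p_hat amounts to.
    """
    rng = random.Random(seed)
    proportions = []
    for _ in range(n_resamples):
        successes = sum(1 for _ in range(n) if rng.random() < p_hat)
        proportions.append(successes / n)
    return statistics.stdev(proportions)


se_exact = exact_bootstrap_se(0.629, 500)
se_sim = simulated_bootstrap_se(0.629, 500)

print(f"closed form: {se_exact:.4f}")  # closed form: 0.0216
print(f"simulated:   {se_sim:.4f}")    # close to the closed form, varies with the seed
```

The simulated value fluctuates a little with the seed and the number of resamples, but it should settle near the closed-form value as `n_resamples` grows.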

EDIT: Also, as far as I understand, this is the exact standard error (or standard deviation, if you will) of the bootstrap distribution of the sample mean. There is no need to approximate it, since it is available in closed form in this case.
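On the attempt in the question: 1.96 is not a standard deviation. It is the standard normal critical value for 95% coverage, and it multiplies the standard error when the interval is formed; it does not enter the standard error itself. Also, $n$ is the full sample size, 500, not the number of students who answered yes. As a rough sketch, assuming the usual normal-approximation interval (the question does not say which kind of bootstrap interval is intended), you would get $$ 0.629 \pm 1.96 \times 0.0216 \approx 0.629 \pm 0.042 = (0.587,\ 0.671) $$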
