Solved – Determining power of statistical test through bootstrapping

bootstrap, statistical-power

I'm trying to understand what's happening in the article "Using simulation to estimate the power of a statistical test".

I'm familiar with bootstrapping, and it makes sense to me. But I don't understand how bootstrapping can be used to determine power.

From the article:

  1. Simulate data from the model for each group's population. These are the samples. The populations are chosen so that the true difference between the population means is δ > 0. (The null hypothesis is false.)
  2. Run the TTEST procedure on the samples. For each sample, record whether the t test rejects the null hypothesis.
  3. Count how many times the t test rejects the null hypothesis. This proportion is an estimate for the power of the test.

Is counting how many times the test rejects the null hypothesis all you need to determine power? That doesn't make sense to me.

Best Answer

Let's simplify the problem by assuming you are interested in estimating the power of a one-sample t-test for testing a population mean mu via the hypotheses H0: mu = 0 vs Ha: mu != 0. Assume the population is normal with unknown mean mu and standard deviation sigma = 1. (The value sigma = 1 is only used to generate the data; the t-test itself estimates the standard deviation from each sample.)

To estimate the power of the test via simulation, you would assume that mu = 2, say (or any other value covered by the alternative hypothesis that reflects the magnitude of mu you would want the t-test to detect). You would then generate a large number N of random samples of size n from a normal population with mean mu = 2 and standard deviation sigma = 1. Using the data from each of these random samples, you would perform a one-sample t-test of H0: mu = 0 vs Ha: mu != 0 at a fixed significance level (alpha = 0.05, say). The power of the test is estimated by the proportion of these N tests in which H0 was rejected in favour of Ha. This works because power is, by definition, the probability of rejecting H0 when the alternative is true, and the rejection proportion over many simulated samples estimates exactly that probability. The power is indexed by the value of mu used for generating the random samples (in this example, mu = 2).
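The simulation recipe above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code from the article: the function names, the sample size n = 5, the number of simulations, and the seed are all made up for the example, and the critical value 2.776 is hard-coded for a two-sided 5% test with 4 degrees of freedom so that the sketch needs nothing beyond the standard library.

```python
import math
import random
import statistics

# Two-sided 5% critical value of Student's t with n - 1 = 4 degrees of freedom.
# Hard-coded assumption: it is only valid for samples of size n = 5.
T_CRIT_DF4 = 2.776


def rejects_h0(sample, mu0=0.0, t_crit=T_CRIT_DF4):
    """Return True if a two-sided one-sample t-test rejects H0: mu = mu0."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    t_stat = (statistics.mean(sample) - mu0) / standard_error
    return abs(t_stat) > t_crit


def simulated_power(mu, sigma=1.0, n=5, n_sims=10_000, seed=0):
    """Estimate the rejection probability when the data are Normal(mu, sigma)."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        rejections += rejects_h0(sample)  # True counts as 1
    return rejections / n_sims


power_at_2 = simulated_power(mu=2.0)  # alternative is true: estimates power
size_at_0 = simulated_power(mu=0.0)   # null is true: estimates type I error rate
print(f"estimated power at mu = 2: {power_at_2:.3f}")
print(f"estimated rejection rate at mu = 0: {size_at_0:.3f}")
```

Running the same function with mu = 0 is a useful sanity check: when the null hypothesis is true, the rejection proportion should land near the significance level of 0.05.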

When you are in a bootstrapping situation, instead of drawing N random samples from the target normal population under the assumption that mu = 2 and sigma = 1, say, you start from a single random sample. You then treat that sample "as if" it were the entire population and draw N random samples from that assumed population by resampling with replacement, which is the bootstrap step. Each resample is tested exactly as before, and the proportion of rejections again estimates the power.

In the context of this simplified example, the reason you may want to use bootstrapping for estimating power is that you may not know whether it is sensible to assume that the underlying population is normal. If you think the sample distribution provides a reasonable approximation to the population distribution, and the sample is all you have, then you have to do the best you can with that information. Hence, bootstrapping.
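The bootstrap variant can be sketched in the same way. Again, this is a rough illustration under assumptions rather than a definitive implementation: the "observed" sample is generated artificially here as a stand-in for real data, and the names, the observed sample size of 30, the resample size n = 5, and the hard-coded critical value 2.776 (two-sided 5% test, 4 degrees of freedom) are all hypothetical choices for the example.

```python
import math
import random
import statistics

# Two-sided 5% critical value of Student's t with 4 degrees of freedom.
# Hard-coded assumption: it is only valid for resamples of size n = 5.
T_CRIT_DF4 = 2.776


def rejects_h0(sample, mu0=0.0, t_crit=T_CRIT_DF4):
    """Return True if a two-sided one-sample t-test rejects H0: mu = mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)
    if sd == 0.0:
        # A resample can consist of one repeated value; the t statistic is
        # then infinite unless the mean equals mu0 exactly.
        return mean != mu0
    t_stat = (mean - mu0) / (sd / math.sqrt(n))
    return abs(t_stat) > t_crit


def bootstrap_power(observed, n=5, n_boot=10_000, seed=0):
    """Treat `observed` as the population and resample from it with replacement."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_boot):
        resample = rng.choices(observed, k=n)  # sampling with replacement
        rejections += rejects_h0(resample)
    return rejections / n_boot


# Stand-in for the single real sample you would have in practice.
data_rng = random.Random(42)
observed = [data_rng.gauss(2.0, 1.0) for _ in range(30)]

boot_power = bootstrap_power(observed)
print(f"bootstrap estimate of power: {boot_power:.3f}")
```

Note that the estimate is now indexed by the observed sample rather than by a chosen value of mu: the "population" has whatever mean and shape the sample happens to have, so the quality of the estimate depends on how well the sample represents the true population.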