Let us assume I have two data sets, each with n observations of pairs of an independent variable x and a dependent variable y. Let us further assume I want to generate a distribution of regression slopes for each data set by bootstrapping the observations (with replacement) N times and calculating the regression y = a + bx each time. How do I compare the two distributions in order to say the slopes are significantly different? A U-test for the difference between the medians of the distributions would depend heavily on N; that is, the more often I repeat the bootstrapping, the more significant the difference will become. How do I have to calculate the overlap between the distributions to determine a significant difference?
Solved – How to compare bootstrapped regression slopes
bootstrap · regression · statistical-significance
Best Answer
Bootstrapping is done to get a more robust picture of the sampling distribution than the one assumed by large-sample theory. When you bootstrap, there is effectively no limit to the number of "bootsamples" you take; in fact, the more bootsamples you take, the better your approximation to the sampling distribution. It is common to use $B=10,000$ bootsamples, although there is nothing magical about that number. Furthermore, you don't run a test on the bootsamples; you have an estimate of the sampling distribution, so use it directly. Here's an algorithm:

1. Bootstrap each data set separately: resample the $(x, y)$ pairs with replacement, fit the regression $y = a + bx$, and store the slope $b$. Repeat $B$ times for each data set.
2. For each of the $B$ iterations, take the difference between the two bootstrapped slopes, $d_i = b_{1i} - b_{2i}$. This gives you the bootstrap distribution of the slope difference.
3. Find the interval that covers the central 95% of the $d_i$ (i.e., the 2.5th and 97.5th percentiles).
4. If 0 falls outside that interval, the slopes differ significantly at $\alpha = .05$.
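A minimal sketch of this procedure in Python with numpy, using simulated data with true slopes 0.5 and 1.5 (the data, sample sizes, and $B$ here are illustrative placeholders, not anything from the question):

```python
import numpy as np

rng = np.random.default_rng(0)

def boot_slopes(x, y, B=2000):
    """Bootstrap the OLS slope: resample (x, y) pairs with replacement B times."""
    n = len(x)
    slopes = np.empty(B)
    for i in range(B):
        idx = rng.integers(0, n, size=n)              # resampled observation indices
        slopes[i] = np.polyfit(x[idx], y[idx], 1)[0]  # polyfit(deg=1) -> [slope, intercept]
    return slopes

# Hypothetical data sets with true slopes 0.5 and 1.5
x1 = rng.uniform(0, 10, 50); y1 = 1.0 + 0.5 * x1 + rng.normal(0, 1, 50)
x2 = rng.uniform(0, 10, 50); y2 = 1.0 + 1.5 * x2 + rng.normal(0, 1, 50)

diff = boot_slopes(x1, y1) - boot_slopes(x2, y2)  # bootstrap distribution of the difference
lo, hi = np.percentile(diff, [2.5, 97.5])         # central 95% interval
print(f"95% interval for b1 - b2: [{lo:.3f}, {hi:.3f}]")
print("significantly different:", not (lo <= 0 <= hi))
```

With true slopes this far apart, the interval should exclude 0; with your own data, simply substitute the two samples and increase B.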
The logic of this algorithm as a statistical test is fundamentally similar to that of classical tests (e.g., t-tests), but you are not assuming that the data or the resulting sampling distributions follow any particular distribution. (For example, you are not assuming normality.) The primary assumption you are making is that your data are representative of the population you sampled from / want to generalize to; that is, that the sample distribution is similar to the population distribution. Note that if your data are not related to the population you're interested in, you are flat out of luck.
Some people worry about using, e.g., a regression model to determine the slope if you're not willing to assume normality. However, this concern is mistaken. The Gauss-Markov theorem tells us that the estimate is unbiased (i.e., centered on the true value), so it's fine. The lack of normality simply means that the true sampling distribution may be different from the theoretically posited one, and so the p-values are invalid. The bootstrapping procedure gives you a way to deal with this issue.
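A quick simulation illustrates the unbiasedness point: even with heavily skewed (here, exponential) errors, OLS slope estimates average out to the true slope. The setup is hypothetical, chosen only to make the non-normality obvious:

```python
import numpy as np

rng = np.random.default_rng(2)
true_slope = 2.0
x = np.linspace(0, 10, 40)

# Fit OLS on many samples whose errors are mean-zero but heavily skewed;
# the slope estimates should still center on the true value.
estimates = [
    np.polyfit(x, 1.0 + true_slope * x + (rng.exponential(1.0, 40) - 1.0), 1)[0]
    for _ in range(5000)
]
print(np.mean(estimates))  # close to 2.0 despite the non-normal errors
```

What the skewed errors distort is the shape of the sampling distribution (and hence the textbook p-values), not where it is centered.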
Two other issues regarding bootstrapping: First, if the classical assumptions are met, bootstrapping is less efficient (i.e., has less power) than a parametric test. Second, bootstrapping works best when you are exploring near the center of a distribution: means and medians work well, quartiles less so, and bootstrapping the min or max necessarily fails. Regarding the first point, you may not need to bootstrap in your situation; regarding the second, bootstrapping the slope is perfectly fine.
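The failure for extremes is easy to see: the bootstrap distribution of the maximum can never exceed the sample maximum, and it equals the sample maximum in roughly $1-(1-1/n)^n \approx 0.632$ of resamples. A small demonstration on simulated uniform data:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(0, 1, 100)

# Bootstrap the sample maximum: it equals max(x) whenever the largest
# observation is resampled at least once, and can never exceed it.
boot_max = np.array([
    rng.choice(x, size=len(x), replace=True).max() for _ in range(10_000)
])
print(np.mean(boot_max == x.max()))  # roughly 0.632, i.e. about 1 - 1/e
```

A distribution with a point mass of ~63% on a single value is a poor stand-in for the true sampling distribution of the maximum, which is why bootstrapping extremes fails; nothing of the sort happens with a slope.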