Solved – How to interpret Bootstrap

bootstrap, regression, spss

I'm a real newbie when it comes to statistics so please don't judge me and my question 😉

I'm doing a linear regression analysis in SPSS, and since my data are neither normally distributed nor homoscedastic, I decided to use bootstrapping.

Now I'm really confused about how to interpret the output. SPSS gives me the "normal" model summary and coefficients as well as the bootstrap summary and bootstrap coefficients. Do I now interpret only the bootstrap part? Or is the F-value, for example, still relevant, meaning that if F is not significant, I also can't interpret the bootstrap interval even though it is significant?

Best Answer

The intuitive idea behind the bootstrap is this: if your original dataset is a random draw from the full population, then a resample drawn from that sample with replacement (of the same size as the original) can be treated as another draw from the population. You can then estimate your model on each of these bootstrapped datasets. This gives you a large number of estimates, so you can, for example, look at the standard deviation of each coefficient across the resamples; this often turns out to be a good estimate of the coefficient's standard error. In fact, the standard error is exactly this standard deviation in the ideal case where the many datasets are drawn from the true population rather than from your sample.
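To make the resampling idea concrete, here is a minimal Python sketch (not SPSS, and not necessarily what SPSS does internally). It resamples rows with replacement, refits an ordinary least-squares line each time, and takes the standard deviation of the refitted coefficients as the bootstrap standard error. The simulated data, the helper names `ols_coefs` and `bootstrap_coefs`, and the choice of 2000 resamples are all assumptions made for illustration.

```python
import numpy as np

def ols_coefs(x, y):
    """Return [intercept, slope] from an ordinary least-squares fit of y on x."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def bootstrap_coefs(x, y, n_boot=2000, seed=0):
    """Refit the model on n_boot resamples of the rows, drawn with replacement."""
    rng = np.random.default_rng(seed)
    n = len(x)
    out = np.empty((n_boot, 2))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample has the same size as the data
        out[b] = ols_coefs(x[idx], y[idx])
    return out

# Simulated data (an assumption): skewed errors whose spread grows with x
rng = np.random.default_rng(42)
x = rng.uniform(0, 10, size=200)
y = 1.0 + 0.5 * x + (rng.exponential(1.0, size=200) - 1.0) * (0.5 + 0.3 * x)

boot = bootstrap_coefs(x, y)
se = boot.std(axis=0, ddof=1)                          # bootstrap standard errors
ci_low, ci_high = np.percentile(boot[:, 1], [2.5, 97.5])  # percentile interval

print("slope estimate:", ols_coefs(x, y)[1])
print("bootstrap SE (intercept, slope):", se)
print("95% percentile interval for slope:", (ci_low, ci_high))
```

The percentile interval used here is only one of several bootstrap interval types; which one a given package reports depends on its settings.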

Suppose, for example, that there is one outlier in your dataset. Many of your bootstrapped datasets will not include that observation, and for those datasets the estimated coefficients can change by a lot.
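The outlier effect can be checked with a small simulation. The sketch below, on made-up data with one planted outlier (both assumptions), records for every resample whether the outlier was drawn and compares the fitted slopes of the two groups. With n observations, a given point is missing from a resample with probability (1 - 1/n)^n, which is roughly 37% for moderate n.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
x = rng.uniform(0, 10, size=n)
y = 2.0 + 1.0 * x + rng.normal(0, 1, size=n)
x[0], y[0] = 10.0, -30.0  # plant a single outlier at index 0 (illustrative)

def slope(x, y):
    """Slope of the least-squares line of y on x."""
    return np.polyfit(x, y, 1)[0]

n_boot = 2000
slopes = np.empty(n_boot)
has_outlier = np.empty(n_boot, dtype=bool)
for b in range(n_boot):
    idx = rng.integers(0, n, size=n)   # draw rows with replacement
    slopes[b] = slope(x[idx], y[idx])
    has_outlier[b] = (idx == 0).any()  # was the outlier drawn at least once?

print("share of resamples without the outlier:", 1 - has_outlier.mean())
print("mean slope, outlier included:", slopes[has_outlier].mean())
print("mean slope, outlier excluded:", slopes[~has_outlier].mean())
```

The spread between the two groups of slopes is part of what the bootstrap standard error picks up, which is why a single influential point can widen the bootstrap interval noticeably.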

Similarly, you can look at the F statistic for each of the bootstrap datasets. You could, for example, see in how many of them the F test is significant. But I am not sufficiently familiar with SPSS to know what it reports as the F statistic: is it the average F statistic?
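As a rough illustration of looking at F across resamples, the sketch below computes the overall F statistic of a simple regression on each bootstrap dataset and counts how often it exceeds a 5% critical value. This says nothing about what SPSS reports. The simulated data are an assumption, and `F_CRIT = 3.94` is an approximate table value for F(1, 98), hard-coded to keep the example dependent on NumPy only. The resulting share is a descriptive summary of how stable the result is, not a p-value.

```python
import numpy as np

def f_statistic(x, y):
    """Overall F statistic for a simple linear regression of y on x."""
    n = len(x)
    X = np.column_stack([np.ones(n), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ss_res = resid @ resid
    ss_tot = ((y - y.mean()) ** 2).sum()
    df_model, df_resid = 1, n - 2
    return ((ss_tot - ss_res) / df_model) / (ss_res / df_resid)

# Simulated data (an assumption): weak linear effect, n = 100
rng = np.random.default_rng(7)
n = 100
x = rng.uniform(0, 10, size=n)
y = 1.0 + 0.15 * x + rng.normal(0, 2, size=n)

F_CRIT = 3.94  # approximate 5% critical value of F(1, 98), taken from tables

n_boot = 2000
f_boot = np.empty(n_boot)
for b in range(n_boot):
    idx = rng.integers(0, n, size=n)  # draw rows with replacement
    f_boot[b] = f_statistic(x[idx], y[idx])

print("F on original data:", f_statistic(x, y))
print("share of resamples with F > F_CRIT:", (f_boot > F_CRIT).mean())
print("median bootstrap F:", np.median(f_boot))
```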