I am having trouble understanding how to use bootstrapping to calculate prediction intervals for a linear regression model. Can somebody outline a step-by-step procedure? I searched on Google, but nothing I found really makes sense to me.
I do understand how to use bootstrapping for calculating confidence intervals for the model parameters.
Best Answer
Confidence intervals take account of the estimation uncertainty. Prediction intervals add to this the fundamental uncertainty. R's `predict.lm` will give you the prediction interval for a linear model. From there, all you have to do is run it repeatedly on bootstrapped samples.

The result of `replicate` is a 3-dimensional array (`n` x 3 x `n.bs`). The length-3 dimension consists of the fitted value for each data element and the lower and upper bounds of the 95% prediction interval.

Gary King method
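Before going further, here is a minimal R sketch of the bootstrap approach described above: refit the model on resampled rows, then call `predict` with `interval = "prediction"` inside `replicate`. It is a reconstruction under stated assumptions, not the answer's original code. The toy data frame `df`, the model `y ~ x`, and the value of `n.bs` are made up for illustration, and averaging the bounds across resamples is only one possible way to summarise the array.

```r
set.seed(1)

# Hypothetical toy data (an assumption, not from the original answer)
n <- 100
df <- data.frame(x = rnorm(n))
df$y <- 1 + 2 * df$x + rnorm(n)

n.bs <- 200  # number of bootstrap resamples (arbitrary choice)

# Each replicate refits the model on rows resampled with replacement and
# returns an n x 3 matrix (columns fit, lwr, upr) for the original rows.
bs <- replicate(n.bs, {
  idx <- sample(n, n, replace = TRUE)
  fit <- lm(y ~ x, data = df[idx, ])
  predict(fit, newdata = df, interval = "prediction", level = 0.95)
})

dim(bs)  # n x 3 x n.bs

# Assumption for illustration: summarise by averaging fit/lwr/upr
# over the bootstrap dimension.
pi.avg <- apply(bs, c(1, 2), mean)
head(pi.avg)
```

Indexing the second dimension picks out one quantity across all resamples, for example `bs[, 2, ]` for the lower bounds.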
Depending on what you want, there's a cool method by King, Tomz, and Wittenberg. It's relatively easy to implement, and avoids the problems of bootstrapping for certain estimates (e.g. `max(Y)`).

I'll quote from his definition of fundamental uncertainty here, since it's reasonably nice: