Solved – How to correctly weight predicted values from a fitted linear model

r, regression

I am using R to fit a linear model.

My code is:

rating_lm <- lm(rating$flow ~ I(rating$raw^2) + rating$raw, data = rating, weights = 1/(rating$flow))

I then use the following code to get prediction intervals:

b <- predict(rating_lm, interval = "prediction")

The graph below shows: the fitted line (red line), the data points and the prediction intervals (blue lines).

[Figure: fitted line (red), data points, and prediction intervals (blue)]

I used the weighting 1/rating$flow because we are much more confident in the low measured Y values.

I need to use the fitted linear model to predict from new X data. However, when I do this, the prediction intervals for the new data are not close to those of the fitted model.

My question is: how can I ensure that the new predicted values have the same (or very similar) prediction intervals as the fitted model?

Best Answer

When you fit a linear model and generate prediction intervals, you assume that the model form holds at every point where you predict, including points outside the range of the data used to fit it. The only difference between a confidence interval for the model estimate at a particular point and a prediction interval is the added uncertainty of an independent random error.
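
To make the difference between the two intervals concrete, here is a minimal sketch in R. It uses simulated data, not the asker's: the data frame `rating`, its columns `raw` and `flow`, and the new points in `new_x` are all stand-ins. Because `flow` is unknown at new points, the sketch plugs the fitted mean into the weights, which is an assumption rather than the only reasonable choice.

```r
# Simulated stand-in for the asker's data: `raw` is the predictor, `flow` the response.
set.seed(1)
raw  <- seq(1, 10, length.out = 50)
mu   <- 2 + 0.5 * raw + 0.3 * raw^2
flow <- mu + rnorm(50, sd = 0.2 * sqrt(mu))  # error variance grows with the mean
rating <- data.frame(raw = raw, flow = flow)

# Bare column names in the formula let predict() look the variables up in `newdata`.
rating_lm <- lm(flow ~ I(raw^2) + raw, data = rating, weights = 1 / flow)

# Hypothetical new X values, all inside the fitted range.
new_x <- data.frame(raw = c(2.5, 5, 7.5))

# Confidence interval: uncertainty in the estimated mean only.
conf_int <- predict(rating_lm, newdata = new_x, interval = "confidence")

# Prediction interval: adds the variance of a new independent error.
# Assumption: the fitted mean replaces the unknown flow in the weights.
new_w <- 1 / predict(rating_lm, newdata = new_x)
pred_int <- predict(rating_lm, newdata = new_x,
                    interval = "prediction", weights = new_w)

# Compare interval widths at each new point.
widths <- data.frame(
  raw        = new_x$raw,
  conf_width = conf_int[, "upr"] - conf_int[, "lwr"],
  pred_width = pred_int[, "upr"] - pred_int[, "lwr"]
)
widths
```

The prediction interval is wider than the confidence interval at every point, because it includes the extra error term. If `weights` is left out of the `predict()` call for new data on a weighted fit, `predict.lm` warns that it is assuming constant prediction variance, which is one way intervals for new data can differ from those on the fitted data.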

Statisticians often warn that it is dangerous to extrapolate a regression model outside the range of the data. That could be what is going on here. If you are trying to predict outside that range and the model form does not hold there, observed points can lie far outside the prediction interval. The problem is that the assumption implicit in prediction intervals, namely that the model form extends to the new points, is violated.
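
A simple way to see whether extrapolation is involved is to compare the new X values with the range used for fitting. The sketch below assumes a predictor column named `raw` and uses made-up values for both the fitted data and the new points.

```r
# Hypothetical data: the model was fitted on raw values between 1 and 10.
rating <- data.frame(raw = seq(1, 10, length.out = 50))
new_x  <- data.frame(raw = c(0.5, 4, 9.5, 12))

# Flag new points that fall outside the range of the predictor used for fitting.
fit_range <- range(rating$raw)
new_x$extrapolated <- new_x$raw < fit_range[1] | new_x$raw > fit_range[2]

new_x
```

Points flagged as extrapolated are the ones where the prediction interval depends most heavily on the assumed model form.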
