Solved – Whether increasing the sample size influences the prediction interval


My question is similar to this question, but the solution provided didn't tell whether increasing the sample size influences the prediction interval, so I would like to ask again.

The formula for the confidence interval:
$$
\hat y \pm t_{\alpha/2, n-2} \sqrt{MSE} \sqrt{1/n + \frac{(x-\bar x)^2}{\sum (x_i - \bar x)^2}}
$$

and prediction interval:
$$
\hat y \pm t_{\alpha/2, n-2} \sqrt{MSE} \sqrt{1 + 1/n + \frac{(x-\bar x)^2}{\sum (x_i - \bar x)^2}}
$$

If the sample size is increased, the standard error of the mean outcome at a new observation decreases, so the confidence interval becomes narrower. It seems to me that the prediction interval should become narrower at the same time, which looks obvious from the formula. However, my professor told me that increasing the sample size does not influence the prediction interval very much, so I am confused now. Could anybody give me some explanation?

Best Answer

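A confidence interval is an estimate of the interval in which the mean of the observations will fall when x = xi. As n grows, the term under the square root in its formula, $$ 1/n + \frac{(x-\bar x)^2}{\sum (x_i - \bar x)^2} $$ tends to 0 (provided the observed x values are spread out, so that the sum of squares in the denominator keeps growing).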

A prediction interval is an estimate of the interval in which a future individual observation will fall when x = xi. As n grows, the corresponding term in its formula, $$ 1 + 1/n + \frac{(x-\bar x)^2}{\sum (x_i - \bar x)^2} $$ tends to 1, not to 0.
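To see the two limits numerically, here is a minimal Python sketch that evaluates only the square-root factors from the two formulas. The design is an assumption made for illustration: n evenly spaced x values on [0, 10], evaluated at x = 7. The function name `interval_factors` and its parameters are hypothetical, not from any library.

```python
import math

def interval_factors(n, x0, lo=0.0, hi=10.0):
    """Return (ci_factor, pi_factor) at x0 for n evenly spaced x values on [lo, hi].

    ci_factor = sqrt(1/n + (x0 - x_bar)^2 / Sxx)
    pi_factor = sqrt(1 + 1/n + (x0 - x_bar)^2 / Sxx)
    The evenly spaced design is an illustrative assumption.
    """
    xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)  # sum of squared deviations of x
    common = 1.0 / n + (x0 - x_bar) ** 2 / sxx
    return math.sqrt(common), math.sqrt(1.0 + common)

# Print both factors for increasing sample sizes.
for n in (10, 100, 1000, 10000):
    ci, pi = interval_factors(n, x0=7.0)
    print(f"n={n:>6}  CI factor={ci:.4f}  PI factor={pi:.4f}")
```

Under these assumptions the CI factor keeps shrinking toward 0 as n grows, while the PI factor settles just above 1. Both factors are multiplied by the same quantile and the same square root of MSE, so the difference in behaviour comes entirely from the extra 1 under the square root.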

This means that the confidence interval for the mean of the outcomes at xi gets narrower as the sample size grows (as the central limit theorem would suggest): with a larger sample, our estimate of the average (mean) outcome at xi gets better.

$$ \lim_{n \to \infty} CI = \hat y $$

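But the dispersion of the distribution of y|xi, that is, the variability of an individual outcome at xi, doesn't shrink as the sample grows, because the central limit theorem is about central tendencies, not about individual behavior or outcomes. Therefore the prediction interval doesn't change very much. For large n, $$ PI \approx \hat y \pm t_{\alpha/2, n-2} \sqrt{MSE} $$ and, because the t quantile approaches the standard normal quantile and MSE approaches the error variance, the interval levels off at a nonzero width: $$ \lim_{n \to \infty} PI = \hat y \pm z_{\alpha/2} \, \sigma $$ Here $$ z_{\alpha/2} $$ is the standard normal quantile and $$ \sigma^2 $$ is the variance of the errors.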
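As a rough worked example, assume a 95% level, MSE = 4 (so the square root of MSE is 2) for both sample sizes, and evaluate at the mean of x so that the last term under the square root vanishes. These numbers are illustrative assumptions, and the t quantiles (2.306 for 8 degrees of freedom, about 1.962 for 998) are rounded table values.

For n = 10:
$$
CI: \hat y \pm 2.306 \cdot 2 \cdot \sqrt{1/10} \approx \hat y \pm 1.46, \qquad PI: \hat y \pm 2.306 \cdot 2 \cdot \sqrt{1 + 1/10} \approx \hat y \pm 4.84
$$

For n = 1000:
$$
CI: \hat y \pm 1.962 \cdot 2 \cdot \sqrt{1/1000} \approx \hat y \pm 0.12, \qquad PI: \hat y \pm 1.962 \cdot 2 \cdot \sqrt{1 + 1/1000} \approx \hat y \pm 3.93
$$

Under these assumptions, a hundredfold increase in sample size shrinks the confidence interval by more than a factor of ten, while the prediction interval narrows by less than 20%, and most of that comes from the smaller t quantile.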

Individual behavior remains uncertain no matter how much you increase your sample size ;)
