Solved – Why is the confidence interval considered a random interval

Tags: confidence interval, p-value

I've been reading a lot on confidence intervals lately and I keep seeing statements such as: "A 95% confidence interval is a random interval that contains the true parameter 95% of the time" or "A confidence interval is a random variable because x-bar (its center) is a random variable."

Why is the confidence interval considered random? If it's truly random then why bother with confidence intervals at all? Am I missing something here?

Best Answer

Why is the confidence interval considered random?

You just stated why in your question! You quoted this:

"A confidence interval is a random variable because x-bar (its center) is a random variable."

(In this case, it's presumably an interval for the mean, but the reasoning carries over to other confidence intervals.)

The sample mean is a statistic -- a quantity you calculate from the sample. Because random samples from some population are, well, random, things calculated from them are also going to be random.

Consider: If you drew a second sample from the same population would you have the same observations?

Would the sample mean be the same in both samples? Would the sample standard deviation be the same in both samples? The largest observation? The lower quartile?

No, they vary from sample to sample; indeed they're also random.
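
To see this concretely, here is a minimal simulation sketch (the normal population with mean 100 and sd 15 and the sample size of 30 are just made-up numbers for the demo): two samples from the same population give noticeably different values for each of the statistics mentioned above.

```python
import numpy as np

# Draw two samples of size 30 from the same hypothetical normal population
# and compare a few statistics computed from each.
rng = np.random.default_rng(42)
mu, sigma, n = 100.0, 15.0, 30

for label in ("first sample ", "second sample"):
    x = rng.normal(mu, sigma, size=n)
    print(label,
          "mean =", round(x.mean(), 2),
          "sd =", round(x.std(ddof=1), 2),
          "max =", round(x.max(), 2),
          "lower quartile =", round(np.percentile(x, 25), 2))
```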

A confidence interval is also calculated from the random sample, so it, too, is a statistic (you can think of it as the pair of its endpoints, each of which is computed from the data) and it, too, is random.
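
The following sketch makes the "random interval" idea concrete under assumed, made-up population parameters: each replicate produces a different pair of endpoints, yet across many replicates roughly 95% of the intervals happen to cover the true mean, which is exactly the sense of the quote in the question.

```python
import numpy as np
from scipy import stats

# Repeatedly sample from a known (hypothetical) normal population, compute a
# 95% t-interval for the mean from each sample, and record how often the
# interval covers the true mean.
rng = np.random.default_rng(0)
mu, sigma, n, reps = 100.0, 15.0, 30, 10_000
t_crit = stats.t.ppf(0.975, df=n - 1)

covered = 0
for _ in range(reps):
    x = rng.normal(mu, sigma, size=n)
    half_width = t_crit * x.std(ddof=1) / np.sqrt(n)
    lo, hi = x.mean() - half_width, x.mean() + half_width
    covered += (lo <= mu <= hi)

print("coverage ~", covered / reps)   # should come out close to 0.95
```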

If it's truly random then why bother with confidence intervals at all?
Am I missing something here?

Well, presumably you'd like to use the data to calculate your interval. After all, the data are the thing we have that tells us something about the population we drew the sample from.

If you're using the data - a random sample of your population - then useful quantities you calculate from it will also be random, including confidence intervals.

Random doesn't mean "ignores your data" -- for example a sample mean tells us about our population mean, and our sample standard deviation can be used to help us work out how far the sample mean will tend to be from the population mean.
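
As a small sketch of that last point (again with made-up population numbers): the sample standard deviation divided by the square root of the sample size estimates the standard error of the mean, i.e. how far the sample mean typically lands from the population mean.

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, n = 100.0, 15.0, 30

# Standard error estimated from a single sample: s / sqrt(n)
one_sample = rng.normal(mu, sigma, size=n)
se_from_one_sample = one_sample.std(ddof=1) / np.sqrt(n)

# Observed spread of the sample mean over many repeated samples
many_means = rng.normal(mu, sigma, size=(10_000, n)).mean(axis=1)

print(se_from_one_sample, many_means.std(), sigma / np.sqrt(n))
# all three should be in the same ballpark
```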

In fact, we rely on the randomness - we exploit it to make the best possible use of the information in our sample. Without random sampling, our intervals wouldn't necessarily tell us much of anything.

[You might like to ponder whether there might be a way to get an interval for a population quantity that is simultaneously reasonably informative and not random.]