Solved – Interpreting a 95% confidence interval

confidence interval

I originally posted the following as a partial answer to a question asking why a 95% confidence interval does not imply that there is a 95% chance that the interval contains the true mean (see: Why does a 95% Confidence Interval (CI) not imply a 95% chance of containing the mean?). A commenter (thanks to John) subsequently asked me to post the comment as a separate question, so here goes.

Firstly, I'm going to assume that if I select a playing card at random from a standard deck, the probability that I've selected a club (without looking at it) is 13 / 52 = 25%.

And secondly, it's been stated many times that a 95% confidence interval should be interpreted in terms of repeating an experiment many times: the calculated interval will contain the true mean in 95% of the repetitions. I think this was demonstrated reasonably convincingly by James Waters' simulation in the question linked above. Most people seem to accept this interpretation of a 95% CI.
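To make the repeated-experiment reading concrete, here is a minimal sketch of such a simulation (not the one from the linked question). The population mean, standard deviation, sample size and number of trials are all made-up values for illustration, and the interval is the textbook known-sigma z-interval, which is an assumption on my part.

```python
import random
from statistics import NormalDist, fmean


def coverage(true_mean=175.0, sigma=7.0, n=25, trials=20000, level=0.95, seed=0):
    """Fraction of simulated z-intervals that contain the true mean.

    All default values are hypothetical; sigma is treated as known.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        sample_mean = fmean(sample)
        # The interval is sample_mean +/- half_width; check if it covers the truth.
        if abs(sample_mean - true_mean) <= half_width:
            hits += 1
    return hits / trials


print(f"fraction of intervals containing the true mean: {coverage():.3f}")
```

The printed fraction should land close to 0.95. Note that the true mean is held fixed throughout while the interval changes from sample to sample.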

Now, for the thought experiment. Let's assume that we have a normally distributed variable in a large population – maybe heights of adult males or females. I have a willing and tireless assistant whom I task with performing multiple sampling processes of a given sample size from the population and calculating the sample mean and 95% confidence interval for each sample. My assistant is very keen and manages to measure all possible samples from the population. Then, for each sample, my assistant either records the resulting confidence interval as green (if the CI contains the true mean) or red (if the CI doesn't contain the true mean). Unfortunately, my assistant will not show me the results of his experiments. I need to get some information about the heights of adults in the population but I only have time, resources and patience to do the experiment once. I make a single random sample (of the same sample size used by my assistant) and calculate the confidence interval (using the same equation).

I have no way of seeing my assistant's results. So, what is the probability that the random sample I have selected will yield a green CI (i.e. the interval contains the true mean)?

In my mind, this is the same as the deck of cards situation outlined previously and can be interpreted to mean that there is a 95% probability that the interval calculated using my sample is green (i.e. contains the true mean). And yet, the consensus seems to be that a 95% confidence interval can NOT be interpreted as there being a 95% probability that the interval contains the true mean. Why (and where) does my reasoning in the above thought experiment fall apart?

Best Answer

The confusion comes from this sentence:

And yet, the consensus seems to be that a 95% confidence interval can NOT be interpreted as there being a 95% probability that the interval contains the true mean.

It is a partial misunderstanding of the real consensus. The confusion comes from not being specific about which probability we are talking about: not as a philosophical question, but in the sense of "exactly which probability is meant in this context". As @ratsalad says, it's all about conditioning.

Call $\theta$ your parameter, $X$ your data, and $I$ an interval that is a function of $X$:

  • $I$ being a confidence interval means $P(\theta\in I\mid\theta)\geq 0.95$ for all possible $\theta$, including the true one. The probability averages over all possible $X$ at fixed $\theta$. This is what you explain in your interpretation.
  • $I$ being a (Bayesian) credible interval means $P(\theta\in I\mid X)\geq 0.95$. The probability averages over all possible $\theta$ at fixed $X$.

Both are probabilities of the same event, but conditioned differently.
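A small numerical sketch may help show that the two conditionings really give different numbers. It rests on assumptions that are mine, not part of the argument above: a single observation $X\sim N(\theta,\sigma^2)$ with known $\sigma$, the interval $I=[X-z\sigma,\,X+z\sigma]$, and (for the second probability only) a hypothetical $N(0,1)$ prior on $\theta$. The function names are made up for illustration.

```python
from statistics import NormalDist

Z = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value, about 1.96


def prob_given_theta(theta, sigma=1.0):
    """P(theta in [X - Z*sigma, X + Z*sigma] | theta), where X ~ N(theta, sigma^2).

    theta is fixed and the interval is random: the frequentist coverage.
    """
    x_dist = NormalDist(theta, sigma)
    return x_dist.cdf(theta + Z * sigma) - x_dist.cdf(theta - Z * sigma)


def prob_given_x(x, sigma=1.0, prior_mean=0.0, prior_sd=1.0):
    """P(theta in [x - Z*sigma, x + Z*sigma] | X = x) under an assumed normal prior.

    The interval is fixed and theta is random (conjugate normal posterior).
    """
    post_var = 1.0 / (1.0 / prior_sd ** 2 + 1.0 / sigma ** 2)
    post_mean = post_var * (prior_mean / prior_sd ** 2 + x / sigma ** 2)
    posterior = NormalDist(post_mean, post_var ** 0.5)
    return posterior.cdf(x + Z * sigma) - posterior.cdf(x - Z * sigma)


for theta in (-2.0, 0.0, 5.0):
    print(f"given theta = {theta:4.1f}: {prob_given_theta(theta):.3f}")
for x in (0.0, 1.5, 3.0):
    print(f"given X     = {x:4.1f}: {prob_given_x(x):.3f}")
```

Conditioned on $\theta$, the value is 0.95 whatever $\theta$ is. Conditioned on $X$, it depends on the observed value and on the prior: under this particular prior it is above 0.95 when $x$ is near the prior mean and well below it when $x$ is far away.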

The reason one is discouraged from saying "the probability that $\theta$ is in $I$ is 0.95" for confidence intervals is that this sentence implicitly means the second statement: when we say "the probability that...", the conditioning is implicitly on what has already been observed. "I have seen some $X$, now what is the probability that $\theta$ is..." is formally "what is $P(\theta...\mid X)$".

This implicit reading is reinforced by the (again implicit) suggestion, when you read "probability that $\theta$ is in $I$", that $\theta$ is the variable and $I$ the fixed object, whereas in frequentist analysis it is the opposite.

Finally, this is made even worse when you replace $I$ with your calculated interval. If you write "The probability that $\theta$ is in $[4;5]$ is 0.95", then under the frequentist reading this is simply false. In frequentist analysis, "$\theta$ is in $[4;5]$" is either true or false; it is not a random event, so it does not have a probability (other than 0 or 1). Thus the sentence can only be meaningfully interpreted as the Bayesian one.
