Confidence Interval – How to Correctly Phrase a Frequentist Confidence Interval

confidence-interval, credible-interval

I am aware that there are many, many threads on this (e.g. this excellent thread). I may have missed it, but I can't seem to find one that actually explains how to accurately report a frequentist confidence interval using the actual numbers contained in the interval.

So say I have a coefficient from a regression, $\beta = 3.4$, with $CI = [0.5, 5.6]$.

Bayesian Credible Interval

If the CI were a Bayesian credible interval, reporting it would be quite straightforward:

"Given the data and the assumptions of the model, there is a 95% probability that the true value of $\beta$ lies between 0.5 and 5.6"

Frequentist Confidence Interval

Now, as we know, when the interval is a confidence interval, reporting it properly is trickier. Based on what I've read, I would hazard:

"If we ran many experiments, 95% of the 95% intervals constructed would contain the true value of $\beta$"

What confuses me is that the actual numbers in the CI do not appear in this interpretation.

How do I work either the point estimate or the CI into my reporting of the effect while staying faithful to the frequentist view of probability?

Best Answer

There are various ways you can reasonably word a confidence interval statement, and any variation on the statements below would be fine. (Since you did not specify otherwise, I am assuming that this is a 95% confidence interval. If not, you should make the appropriate changes in the statement.) What is important is that you clearly give your confidence level and the interval of interest. You should also make sure that you properly distinguish between notation for estimates and notation for true parameter values.

  • Our estimate of the slope is $\hat{\beta} = 3.4$ ($\text{95\% CI} = [0.5, 5.6]$).

  • With 95% confidence we infer that the true slope value is somewhere in the interval $0.5 \leqslant \beta \leqslant 5.6$.

  • With 95% confidence we find that $0.5 \leqslant \beta \leqslant 5.6$.
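If you produce such statements for many coefficients, a small helper can keep the wording consistent. The following is only a sketch: the function name `report_ci`, its parameters, and the plain-text "beta-hat" notation are illustrative assumptions, not part of any library.

```python
def report_ci(estimate, lower, upper, level=0.95, symbol="beta"):
    """Format a point estimate and its confidence interval as a sentence.

    All names here are hypothetical; adapt the wording to your own report.
    """
    if not 0 < level < 1:
        raise ValueError("level must be strictly between 0 and 1")
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    pct = f"{level * 100:g}"  # e.g. 0.95 -> "95"
    return (
        f"Our estimate of the slope is {symbol}-hat = {estimate:g} "
        f"({pct}% CI = [{lower:g}, {upper:g}])."
    )


print(report_ci(3.4, 0.5, 5.6))
```

The hat is attached to the estimate only, which keeps the distinction between the estimate and the true parameter visible in the output.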

The concept of "confidence" has a clear and well-known meaning in classical statistics, so if you refer to having a certain level of confidence that a parameter falls within an interval, that will automatically be understood as reporting a confidence interval. There is no need for you to worry about staying faithful to a particular framework: there is only one framework of probability theory used in practice$^\dagger$ and only one meaning of a confidence interval. Indeed, when other frameworks develop analogous ideas (e.g., "credible intervals" in Bayesian inference), they make sure to use different terminology precisely to avoid confusion on this topic.
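For readers who do want to see the long-run meaning behind "confidence", a small simulation can illustrate it. The sketch below is not the regression from the question. It assumes normally distributed data with a known standard deviation and builds a z-interval for a mean, purely to keep the example within the standard library. The function name `coverage` and all parameter values are hypothetical.

```python
import random
from statistics import NormalDist, fmean


def coverage(true_mean=3.4, sigma=2.0, n=30, reps=2000, level=0.95, seed=0):
    """Estimate how often a z-interval for the mean contains the true mean.

    Assumes normal data with known sigma (an illustrative simplification).
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for level=0.95
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(reps):
        # One simulated "experiment": draw a sample and build its interval.
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        centre = fmean(sample)
        if centre - half_width <= true_mean <= centre + half_width:
            hits += 1
    return hits / reps


print(coverage())  # proportion of intervals that contain the true mean
```

Under these assumptions the printed proportion should land close to the nominal level. Each individual interval either contains the true value or does not; the confidence level describes the procedure across repetitions.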

If you are just doing applied statistical work, there is no need to specify the exact statistical meaning of "confidence". It is subtle and confusing to most readers, and you can reasonably put the onus on them to read up on it if they are interested. Many people with an applied science background will have taken some introductory statistics courses, where they learned and then forgot what a confidence interval means. Most readers will be satisfied that the statistical profession has given its imprimatur to the concept. Statistical experts will know the exact meaning of the concept and be happy with your summary report just the same.

Worrying about the proper interpretation of "probability" is another order removed from this, and it is certainly not something you need to concern yourself with in reporting statistical analysis of data. Interested readers can dive down the rabbit-hole of philosophy and the foundations of probability theory if they really wish to do so.


$^\dagger$ There are occasional papers in academic literature that examine non-standard versions of "probability" (e.g., using complex numbers, etc.) but these are basically all bunk.
