Interpreting a confidence interval


If someone asks me to interpret a confidence interval (say, at the level of a first graduate course in statistics), are there typical templates for such an interpretation?

By typical templates I mean something like this (and please feel free to edit if you have a better way of saying this):

The interval in question:

  1. Has Some-property between [[Something]] and [[Something]]
  2. Is sensitive to [[Some-property]]
  3. ….

where the strings between [[ and ]] can be replaced by values, concepts, parameters, etc.

Another way of stating this: is there a checklist of things to interpret about a confidence interval before you move on to the case-by-case interpretation?

Thanks in advance.

Note: there are some interesting questions on interpreting confidence intervals on this site, like here, here and here, but they did not answer the question I have.

Best Answer

The key concepts with confidence intervals are coverage, accuracy and correctness.

Coverage

The coverage or confidence level should be explained first. It is the percentage of times the random interval is expected to include the true value of the parameter.

The best way to show this is to take a probability statement about a pivotal quantity and show how that statement is inverted to get a confidence interval. An example might be a confidence interval for the mean $\mu$ of a normal distribution when the variance is known to be $1$. Let the sample be denoted $X_i$, $i=1,2,\ldots,n$. The students will know from undergraduate courses, or will have been taught earlier in this particular graduate course, that the sample mean $\bar{X}=\sum_{i=1}^n X_i/n$ is normal with mean $\mu$ and variance $1/n$. The pivotal quantity is then $$Z= \sqrt{n}\,(\bar{X} - \mu),$$ and $Z$ has an $N(0,1)$ distribution. So $\Pr(|Z| \le 1.96) = 0.95$ (from a table of the standard normal distribution). Inverting this statement shows that the probability is the same as $\Pr(\bar{X}-1.96/\sqrt{n} \le \mu \le \bar{X}+1.96/\sqrt{n})$. The random interval $[\bar{X}-1.96/\sqrt{n},\ \bar{X}+1.96/\sqrt{n}]$ is therefore a prescription for a 95% confidence interval for $\mu$.
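A minimal numerical sketch of this prescription (Python with NumPy assumed; the true mean and sample size below are arbitrary choices used only to generate data):

```python
import numpy as np

rng = np.random.default_rng(0)
mu_true = 2.0                                     # hypothetical "true" mean, unknown in practice
n = 25
x = rng.normal(loc=mu_true, scale=1.0, size=n)    # known variance 1

xbar = x.mean()
half_width = 1.96 / np.sqrt(n)                    # from the inverted probability statement for Z
print(f"95% CI for mu: [{xbar - half_width:.3f}, {xbar + half_width:.3f}]")
```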

It should be clear that if you repeated the experiment many times, each time randomly selecting $n$ observations from an $N(\mu, 1)$ distribution, then in close to 95% of the cases the interval would contain $\mu$, which of course also means that in the remaining (approximately) 5% of cases $\mu$ would lie outside the interval. This is how I would explain an exact 95% confidence interval.
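This repeated-sampling claim is easy to verify by simulation, for example along these lines (again with arbitrary $\mu$ and $n$):

```python
import numpy as np

rng = np.random.default_rng(1)
mu_true, n, reps = 2.0, 25, 100_000

# one row per repetition of the experiment
xbar = rng.normal(mu_true, 1.0, size=(reps, n)).mean(axis=1)
half_width = 1.96 / np.sqrt(n)
coverage = np.mean((xbar - half_width <= mu_true) & (mu_true <= xbar + half_width))
print(coverage)    # close to 0.95
```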

You could certainly use some other simple example such as estimating the rate parameter for an exponential distribution. The idea is to construct a pivotal quantity whose distribution is known and is independent of any unknown parameters.
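For the exponential case, one convenient pivotal quantity is $2\lambda\sum_{i=1}^n X_i \sim \chi^2_{2n}$; a short sketch of the resulting exact interval for the rate $\lambda$ (SciPy assumed, arbitrary values):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
lam_true, n = 0.5, 30                            # hypothetical rate, unknown in practice
x = rng.exponential(scale=1 / lam_true, size=n)

s = x.sum()
# 2 * lambda * sum(X_i) is chi-square with 2n degrees of freedom, free of lambda
lo = stats.chi2.ppf(0.025, df=2 * n) / (2 * s)
hi = stats.chi2.ppf(0.975, df=2 * n) / (2 * s)
print(f"exact 95% CI for the rate: [{lo:.3f}, {hi:.3f}]")
```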

To explain the relationship between coverage and the confidence level, you can point out that substituting $1.645$ for $1.96$ in the original probability statement for $Z$ gives a probability of $0.90$, so replacing $1.96$ by $1.645$ in the confidence interval prescription yields a 90% confidence interval. This also illustrates how lowering the coverage narrows the interval.
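The two critical values and the resulting interval widths can be checked directly, e.g.:

```python
import numpy as np
from scipy import stats

n = 25
z95 = stats.norm.ppf(0.975)    # about 1.96
z90 = stats.norm.ppf(0.95)     # about 1.645
print(2 * z95 / np.sqrt(n), 2 * z90 / np.sqrt(n))   # the 90% interval is narrower
```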

Accuracy

The other two important concepts are what Efron calls accuracy and correctness; I like using that terminology. So far we have looked at examples of exact confidence intervals. They are accurate in the sense that the nominal coverage of 95% is the exact coverage probability. But sometimes it is convenient to use asymptotic theory: instead of the exact distribution of the pivotal quantity, we use the distribution it converges to as the sample size $n$ goes to infinity. With this asymptotic distribution, the coverage of an advertised 95% confidence interval will not be exactly 95% for a given $n$, but if the approximation is good we can say that the approximate confidence interval is reasonably accurate.
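One common illustration of an asymptotic interval whose finite-sample coverage misses the nominal level (a separate example, not the normal-mean one above) is the Wald interval for a binomial proportion with small $n$ and $p$ near the boundary:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
p_true, n, reps = 0.1, 30, 100_000           # small n, p near the boundary

k = rng.binomial(n, p_true, size=reps)
phat = k / n
se = np.sqrt(phat * (1 - phat) / n)          # plug-in (asymptotic) standard error
z = stats.norm.ppf(0.975)
covered = (phat - z * se <= p_true) & (p_true <= phat + z * se)
print(covered.mean())                        # noticeably below the nominal 0.95
```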

(This is important in the bootstrap literature on confidence intervals because bootstrap confidence intervals are never exact, and in some situations certain bootstrap variants (e.g. the BCa method) give more accurate intervals than others. The bootstrap theory on order of accuracy was developed by Peter Hall and others, who defined accuracy by the rate at which the actual coverage approaches the advertised confidence level as $n$ goes to infinity. The results involve Edgeworth expansions and are detailed in Hall's book The Bootstrap and Edgeworth Expansion.)
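As a practical aside, SciPy's stats.bootstrap (SciPy 1.7 or later) implements both the percentile and BCa methods, so the difference can be demonstrated on a skewed sample; a minimal sketch:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
x = rng.exponential(scale=2.0, size=40)      # a skewed sample; the statistic is the mean

for method in ("percentile", "BCa"):
    res = stats.bootstrap((x,), np.mean, confidence_level=0.95,
                          n_resamples=9999, method=method, random_state=rng)
    ci = res.confidence_interval
    print(f"{method:>10}: [{ci.low:.3f}, {ci.high:.3f}]")
```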

Correctness

Last of all I would discuss correctness. For many problems there are several ways to construct confidence intervals with exact or asymptotic coverage of 95%. How do we choose between them? They will generally have different expected lengths. An exact confidence interval with the shortest expected length is called correct and is the optimal one to choose. When a correct interval exists and an efficient estimator of the parameter is available, the correct interval can be constructed by basing it on that efficient estimator.
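To make this concrete, here is an illustrative comparison (not part of the construction above) of two 95% intervals for the mean of $N(\mu,1)$ data: one centered at the efficient estimator (the sample mean) and one centered at the sample median, whose asymptotic standard error is $\sqrt{\pi/2}/\sqrt{n}$. Both cover close to 95% of the time, but the mean-based interval is about 25% shorter:

```python
import numpy as np

rng = np.random.default_rng(5)
mu_true, n, reps = 0.0, 100, 50_000
samples = rng.normal(mu_true, 1.0, size=(reps, n))

# interval centered at the efficient estimator (the sample mean)
xbar = samples.mean(axis=1)
hw_mean = 1.96 / np.sqrt(n)

# interval centered at the sample median (asymptotic s.e. sqrt(pi/2)/sqrt(n) for N(mu, 1))
med = np.median(samples, axis=1)
hw_med = 1.96 * np.sqrt(np.pi / 2) / np.sqrt(n)

print("coverage, mean-based:  ", np.mean(np.abs(xbar - mu_true) <= hw_mean))
print("coverage, median-based:", np.mean(np.abs(med - mu_true) <= hw_med))
print("length ratio (median / mean):", hw_med / hw_mean)   # about 1.25
```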
