confidence-interval – Confidence Interval Coverage for Discrete Functions: In-Depth Analysis

confidence interval, discrete data

How to calculate discrete interval coverage?

What I know how to do:

If I had a continuous model, I could define a 95% confidence interval for each of my predicted values, and then see how often the actual values were within the confidence interval. I might find that only 88% of the time did my 95% confidence interval cover the actual values.

What I don't know how to do:

How do I do this for a discrete model, such as Poisson or gamma-Poisson? What I have for this model is as follows, taking a single observation (out of over 100,000 I plan to generate):

Observation #: (arbitrary)

Predicted value: 1.5

Predicted probability of 0: .223

Predicted probability of 1: .335

Predicted probability of 2: .251

Predicted probability of 3: .126

Predicted probability of 4: .048

Predicted probability of 5: .014 [and 5 or more is .019]

…(etc)

Predicted probability of 100 (or to some otherwise unrealistic figure): .000

Actual value (an integer such as "4")

Note that while I've given Poisson values above, in the actual model a predicted value of 1.5 may have different predicted probabilities of 0,1,…100 across observations.

I'm confused by the discreteness of the values. A "5" is obviously outside the 95% interval, since there's only .019 at 5 and above, which is less than .025. But there will be a lot of 4's; individually they are within the interval, but how do I jointly evaluate the number of 4's more appropriately?
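One way to make the arithmetic above concrete is sketched below in R. It assumes each observation's predictive distribution is Poisson with a known mean; the variable names and the simulated data are hypothetical and do not come from the model described in the question. The central 95% interval is taken from the predictive quantiles, and the empirical coverage is compared with the coverage the model itself implies for those intervals, which for discrete outcomes is usually above 95%.

```r
# Central 95% predictive interval for one observation with predicted mean 1.5
mu <- 1.5
lo <- qpois(0.025, mu)   # smallest k with P(X <= k) >= 0.025
hi <- qpois(0.975, mu)   # smallest k with P(X <= k) >= 0.975
# Probability the model itself assigns to the interval [lo, hi]
implied <- ppois(hi, mu) - ppois(lo - 1, mu)

# Hypothetical data: predicted means, and actual counts drawn from the same model
set.seed(1)
n_obs <- 10000
pred_mean <- runif(n_obs, 0.5, 3)
actual <- rpois(n_obs, pred_mean)

# Per-observation intervals and the two coverage figures to compare
lower <- qpois(0.025, pred_mean)
upper <- qpois(0.975, pred_mean)
empirical_coverage <- mean(actual >= lower & actual <= upper)
implied_coverage <- mean(ppois(upper, pred_mean) - ppois(lower - 1, pred_mean))
```

For the single observation with mean 1.5 the interval is 0 to 4, matching the reasoning that a 5 falls outside it. With real data, a gap between `empirical_coverage` and `implied_coverage` would be the quantity of interest, rather than the gap to 0.95. For a gamma-Poisson model, `qnbinom` and `pnbinom` could presumably take the place of the Poisson functions.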

Why do I care?

The models I'm looking at have been criticized for being accurate at the aggregate level but giving poor individual predictions. I want to see how much worse the poor individual predictions are than the inherently wide confidence intervals predicted by the model. I'm expecting the empirical coverage to be worse (e.g. I might find 88% of the values lie within the 95% confidence interval), but I hope only a bit worse.

Best Answer

Neyman's confidence intervals make no attempt to provide coverage of the parameter in the case of any particular interval. Instead they provide coverage over all possible parameter values in the long run. In a sense they attempt to be globally accurate at the expense of local accuracy.

Confidence intervals for binomial proportions offer a clear illustration of this issue. Neymanian assessment of intervals yields irregular coverage plots like this one, which is for 95% Clopper-Pearson intervals with n=10 binomial trials:

[Figure: Clopper-Pearson coverage plot]
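A coverage curve of this kind can be computed directly. The following is a minimal R sketch, not the code used for the original figure: for each value of p it adds up the binomial probabilities of the outcomes whose Clopper-Pearson interval contains p. The grid `p_grid` and the helper name `cp_coverage` are illustrative choices.

```r
n <- 10
alpha <- 0.05
x <- 0:n

# Clopper-Pearson limits from beta quantiles; the limits at x = 0 and x = n
# are set to 0 and 1 directly
cp_lower <- c(0, qbeta(alpha / 2, 1:n, n:1))
cp_upper <- c(qbeta(1 - alpha / 2, 1:n, n:1), 1)

# Coverage at a given p: total probability of the outcomes whose interval contains p
cp_coverage <- function(p) {
  sum(dbinom(x, n, p)[cp_lower <= p & p <= cp_upper])
}

p_grid <- seq(0.01, 0.99, by = 0.001)
coverage <- sapply(p_grid, cp_coverage)
range(coverage)
```

Plotting `coverage` against `p_grid`, for example with `plot(p_grid, coverage, type = "l")`, shows the irregular sawtooth pattern, which stays at or above the nominal 95% for every p.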

There is an alternative way to do coverage, one that I personally think is much more intuitively approachable and (thus) useful. The coverage by intervals can be specified conditional on the observed result. That coverage would be local coverage. Here is a plot showing local coverage for three different methods of calculating confidence intervals for binomial proportions: Clopper-Pearson, Wilson's score, and a conditional exact method that yields intervals identical to Bayesian intervals with a uniform prior:

[Figure: Conditional coverage for three types of interval]

Notice that the 95% Clopper-Pearson method gives over 98% local coverage but the exact conditional intervals are, well, exact.
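The answer does not spell out how local coverage is computed, so the R sketch below rests on an assumption: local coverage for an observed x is taken to be the posterior probability, under a uniform prior on p, that p lies inside the interval reported for that x. The Bayesian intervals are also assumed to be equal-tailed. Under those assumptions the uniform-prior intervals have local coverage of exactly 95% by construction, and the Clopper-Pearson intervals, being wider, have more.

```r
n <- 10
alpha <- 0.05
x <- 0:n

# Clopper-Pearson limits (0 and 1 at the extremes x = 0 and x = n)
cp_lower <- c(0, qbeta(alpha / 2, 1:n, n:1))
cp_upper <- c(qbeta(1 - alpha / 2, 1:n, n:1), 1)

# Equal-tailed intervals from the Beta(x + 1, n - x + 1) posterior (uniform prior)
bayes_lower <- qbeta(alpha / 2, x + 1, n - x + 1)
bayes_upper <- qbeta(1 - alpha / 2, x + 1, n - x + 1)

# Assumed definition of local coverage: posterior probability, given x,
# that p lies inside the interval reported for that x
local_coverage <- function(lower, upper) {
  pbeta(upper, x + 1, n - x + 1) - pbeta(lower, x + 1, n - x + 1)
}

cp_local <- local_coverage(cp_lower, cp_upper)
bayes_local <- local_coverage(bayes_lower, bayes_upper)
round(cbind(x, cp_local, bayes_local), 3)
```

The printed table gives one row per possible outcome, which is the same layout as a local coverage plot with x on the horizontal axis.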

A way to think of the difference between the global and local intervals is to consider the global intervals to be inversions of Neyman-Pearson hypothesis tests, where the outcome is a decision made by considering long-term error rates for the current experiment as a member of the global set of all experiments that might be run. The local intervals are more akin to inversions of Fisherian significance tests, which yield a P value that represents the evidence against the null from this particular experiment.

(As far as I know, the distinction between global and local statistics was first made in an unpublished Master’s thesis by Claire F Leslie (1998), "Lack of confidence: a study of the suppression of certain counter-examples to the Neyman-Pearson theory of statistical inference with particular reference to the theory of confidence intervals". That thesis is held by the Baillieu library at The University of Melbourne.)

Solved – Logistic regression, “sum” confidence interval

Using Scortchi's suggestion, here is the revised code:

#Scortchi's suggestion
set.seed(44)
prob = .2

#More data arrives
x.new = rnorm(10000)
y.new = rbinom(10000,1,prob)

#The number of bootstrap samples
B = 10000

sum.p = rep(NA,B)
for(i in 1:B){
    #Create a bootstrap sample
    index = sample(1:length(x.new),length(x.new)*.1,replace=TRUE)
    x.boot = x.new[index]       
    y.boot = y.new[index]

    model = glm(y.boot~x.boot,family="binomial")

    #Calculate the sum of p
    sum.p[i] = sum(1/(1+exp(-(model$coef[1]+model$coef[2]*x.boot))))
}

#Get the 2.5% and 97.5% quantiles of the bootstrap estimates
lower = quantile(sum.p,probs=.025)
upper = quantile(sum.p,probs=.975)

#Construct a 95% confidence interval
ci = c(lower,upper)

Interestingly, Scortchi's suggestion gives the following confidence interval:

> ci
 2.5% 97.5% 
  174   223

whereas my original code gives the following:

> ci
    2.5%    97.5% 
242.4727 247.0230

So there is clearly a difference between the two methods.

Confidence Interval – Is Calculating Actual Coverage Probability the Same as Calculating a Credible Interval?

In general, when you are working with a discrete distribution, the actual coverage probability cannot equal the nominal probability for every value of the parameter.

The confidence interval is defined as a function of the data. If you are working with the binomial distribution, there are only finitely many possible outcomes ($n+1$, to be precise), so there are only finitely many possible confidence intervals. Since the parameter $p$ is continuous, the coverage probability (which is a function of $p$) jumps each time $p$ crosses an interval endpoint, so it can do no better than be approximately 95% (or whatever the nominal level is).

It is generally true that methods based on the CLT will have coverage probabilities below the nominal value, but other methods can actually be more conservative.
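As a small illustration of the CLT-based case, the R sketch below computes the actual coverage of the nominal 95% Wald interval for a binomial proportion. The Wald interval is chosen here as one example of a CLT-based method, and the function name `wald_coverage` is made up for this sketch.

```r
# Actual coverage of the nominal 95% Wald (CLT-based) interval
# for a binomial proportion, at a given true p and sample size n
wald_coverage <- function(p, n, alpha = 0.05) {
  x <- 0:n
  p_hat <- x / n
  half_width <- qnorm(1 - alpha / 2) * sqrt(p_hat * (1 - p_hat) / n)
  # Outcomes whose interval contains the true p
  covered <- (p_hat - half_width <= p) & (p <= p_hat + half_width)
  sum(dbinom(x, n, p)[covered])
}

wald_coverage(0.5, 10)

# Coverage across a grid of true proportions
p_grid <- seq(0.05, 0.95, by = 0.05)
round(sapply(p_grid, wald_coverage, n = 10), 3)
```

With n = 10 and p = 0.5, only the outcomes 3 through 7 produce an interval that contains 0.5, so the actual coverage is 912/1024, about 0.89, rather than the nominal 0.95.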
