Solved – coverage of confidence interval

confidence intervalcoverage-probability

I need to check the coverage of a confidence interval, but I don't know which one of the following approaches I should use.

Approach 1:

Estimate the regression parameter $\theta$
Use the sandwich estimator to estimate the variance $\hat{\sigma}$
Create the confidence interval (assuming asymptotic normality)

Repeat it N times and get the proportion of times that the true $\theta$ falls inside the confidence interval.

Approach 2:

Estimate the regression parameter $\theta$ for N simulations
Take the standard error of $\hat{\theta}$, which returns a single number
Create N confidence intervals (assuming asymptotic normality) based on the estimates of $\theta$ and the standard error of $\hat{\theta}$
Get the proportion of times that the true $\theta$ falls inside the confidence interval

In summary, the first one we check if the true $\theta$ falls inside the confidence interval at each simulation (hence, we use the sandwich estimator for the variance). In the second one, we check only after the N simulations have been done, so that the standard error of the estimates is used to create the interval.\

They will create different confidence interval with different coverage. Which one should I use? I have seem people using both.

Best Answer

Approach 1 is the correct way to assess the coverage probability of an estimator. An estimator along with its variance estimate (which is also an estimator) are used to create a 95% CI independent of any knowledge of what future realizations of these data are. If you create N=5000 simulated realizations from the data and take the 2.5 and 97.5 quantiles, then you'll get 95% coverage each time... however, this has no connection to statistics. Each of those N=5000 realizations corresponds to one instance of the statistical experiment. The point of statistics is using the information from N=1 experiment to infer what the estimator will tend to be in the future N=4999 experiments. The estimator must be a function of a single dataset.

The confusing bit may come from bootstrapping where for each N=1 experiment, you bootstrap simulated datasets N*=1000 based on the empirical distribution of the N=1 dataset. These empirical distributions are different for each of the N=5000 simulated realizations, however.

Related Solutions

Bootstrap – Coverage Probabilities of the Basic Bootstrap Confidence Interval

The terminology is probably not used consistently, so the following is only how I understand the original question. From my understanding, the normal CIs you computed are not what was asked for. Each set of bootstrap replicates gives you one confidence interval, not many. The way to compute different CI-types from the results of a set of bootstrap replicates is as follows:

B    <- 999                  # number of replicates
muH0 <- 100                  # for generating data: true mean
sdH0 <- 40                   # for generating data: true sd
N    <- 200                  # sample size
DV   <- rnorm(N, muH0, sdH0) # simulated data: original sample

Since I want to compare the calculations against the results from package boot, I first define a function that will be called for each replicate. Its arguments are the original sample, and an index vector specifying the cases for a single replicate. It returns $M^{\star}$, the plug-in estimate for $\mu$, as well as $S_{M}^{2\star}$, the plug-in estimate for the variance of the mean $\sigma_{M}^{2}$. The latter will be required only for the bootstrap $t$-CI.

> getM <- function(orgDV, idx) {
+     bsM   <- mean(orgDV[idx])                       # M*
+     bsS2M <- (((N-1) / N) * var(orgDV[idx])) / N    # S^2*(M)
+     c(bsM, bsS2M)
+ }

> library(boot)                                       # for boot(), boot.ci()
> bOut <- boot(DV, statistic=getM, R=B)
> boot.ci(bOut, conf=0.95, type=c("basic", "perc", "norm", "stud"))
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 999 bootstrap replicates
CALL : 
boot.ci(boot.out = bOut, conf = 0.95, type = c("basic", "perc", "norm", "stud"))

Intervals : 
Level      Normal            Basic         Studentized        Percentile    
95%   ( 95.6, 106.0 )   ( 95.7, 106.2 )  ( 95.4, 106.2 )   ( 95.4, 106.0 )  
Calculations and Intervals on Original Scale

Without using package boot you can simply use replicate() to get a set of bootstrap replicates.

boots <- t(replicate(B, getM(DV, sample(seq(along=DV), replace=TRUE))))

But let's stick with the results from boot.ci() to have a reference.

boots   <- bOut$t                     # estimates from all replicates
M       <- mean(DV)                   # M from original sample
S2M     <- (((N-1)/N) * var(DV)) / N  # S^2(M) from original sample
Mstar   <- boots[ , 1]                # M* for each replicate
S2Mstar <- boots[ , 2]                # S^2*(M) for each replicate
biasM   <- mean(Mstar) - M            # bias of estimator M

The basic, percentile, and $t$-CI rely on the empirical distribution of bootstrap estimates. To get the $\alpha/2$ and $1 - \alpha/2$ quantiles, we find the corresponding indices to the sorted vector of bootstrap estimates (note that boot.ci() will do a more complicated interpolation to find the empirical quantiles when the indices are not natural numbers).

(idx <- trunc((B + 1) * c(0.05/2, 1 - 0.05/2)) # indices for sorted vector of estimates
[1] 25 975

> (ciBasic <- 2*M - sort(Mstar)[idx])          # basic CI
[1] 106.21826  95.65911

> (ciPerc <- sort(Mstar)[idx])                 # percentile CI
[1] 95.42188 105.98103

For the $t$-CI, we need the bootstrap $t^{\star}$ estimates to calculate the critical $t$-values. For the standard normal CI, the critical value will just be the $z$-value from the standard normal distribution.

# standard normal CI with bias correction
> zCrit   <- qnorm(c(0.025, 0.975))   # z-quantiles from std-normal distribution
> (ciNorm <- M - biasM + zCrit * sqrt(var(Mstar)))
[1] 95.5566 106.0043

> tStar <- (Mstar-M) / sqrt(S2Mstar)  # t*
> tCrit <- sort(tStar)[idx]           # t-quantiles from empirical t* distribution
> (ciT  <- M - tCrit * sqrt(S2M))     # studentized t-CI
[1] 106.20690  95.44878

In order to estimate the coverage probabilities of these CI-types, you will have to run this simulation many times. Just wrap the code into a function, return a list with the CI-results and run it with replicate() like demonstrated in this gist.

Confidence Interval – Is Calculating Actual Coverage Probability the Same as Calculating a Credible Interval?

In general, the actual coverage probability will never be equal to the nominal probability when you are working with a discrete distribution.

The confidence interval is defined as a function of the data. If you are working with the binomial distribution, there are only finitely many possible outcomes ($ n+1$ to be precise), so there are only finitely many possible confidence intervals. Since the parameter $ p $ is continuous, it's pretty easy to see that the coverage probability (which is a function of $ p $) can do no better than be approximately 95% (or whatever).

It is generally true that methods based on the CLT will have coverage probabilities below the nominal value, but other methods can actually be more conservative.

Best Answer

Related Solutions

Bootstrap – Coverage Probabilities of the Basic Bootstrap Confidence Interval

Confidence Interval – Is Calculating Actual Coverage Probability the Same as Calculating a Credible Interval?

Related Question