Solved – What does it mean if your confidence interval does not contain the ‘true’ value

confidence intervalstochastic-processes

I have a sample of a stochastic process (blue) for which I have developed point estimates for each time step (red). I have then proceeded to bootstrap 95% confidence intervals for the point estimates (black), which is shown in the figure. Now as you can see, the CI doesn't always include the true value of the process. How do I interpret this?

[ Stochastic Process (Blue), point estimate line (Red), 95% CI (Black)[1]

Does this mean that my way of finding the CI might be faulty? Or is this a result that can be expected from working with real data?

Best Answer

I don't see a problem. By definition:

If confidence intervals are constructed across many separate data analyses of repeated (and possibly different) experiments, the proportion of such intervals that contain the true value of the parameter will approximately match the confidence level.

That doesn't mean your confidence interval must contain the true value. In statistics, you never know what the true value is. We don't know with 100% certainty if the true value is inside the interval or not, but we know if we repeat the process 100 times, about 95 of those intervals should bracket the unknown true value.

Related Solutions

Statistical Inference – Relationship Between Profile Likelihood and Confidence Intervals

I will not give a complete answer (I have a hard time trying to understand what you are doing exactly), but I will try to clarify how profile likelihood is built. I may complete my answer later.

The full likelihood for a normal sample of size $n$ is $$L(\mu, \sigma^2) = \left( \sigma^2 \right)^{-n/2} \exp\left( - \sum_i (x_i-\mu)^2/2\sigma^2 \right).$$

If $\mu$ is your parameter of interest, and $\sigma^2$ is a nuisance parameter, a solution to make inference only on $\mu$ is to define the profile likelihood $$L_P(\mu) = L\left(\mu, \widehat{\sigma^2}(\mu) \right)$$ where $\widehat{\sigma^2}(\mu)$ is the MLE for $\mu$ fixed: $$\widehat{\sigma^2}(\mu) = \text{argmax}_{\sigma^2} L(\mu, \sigma^2).$$

One checks that $$\widehat{\sigma^2}(\mu) = {1\over n} \sum_k (x_k - \mu)^2.$$

Hence the profile likelihood is $$L_P(\mu) = \left( {1\over n} \sum_k (x_k - \mu)^2 \right)^{-n/2} \exp( -n/2 ).$$

Here is some R code to compute and plot the profile likelihood (I removed the constant term $\exp(-n/2)$):

> data(sleep)
> difference <- sleep$extra[11:20]-sleep$extra[1:10]
> Lp <- function(mu, x) {n <- length(x); mean( (x-mu)**2 )**(-n/2) }
> mu <- seq(0,3, length=501)
> plot(mu, sapply(mu, Lp, x = difference), type="l")

profile likelihood

Link with the likelihood I’ll try to highlight the link with the likelihood with the following graph.

First define the likelihood:

L <- function(mu,s2,x) {n <- length(x); s2**(-n/2)*exp( -sum((x-mu)**2)/2/s2 )}

Then do a contour plot:

sigma <- seq(0.5,4, length=501)
mu <- seq(0,3, length=501)

z <- matrix( nrow=length(mu), ncol=length(sigma))
for(i in 1:length(mu))
  for(j in 1:length(sigma))
    z[i,j] <- L(mu[i], sigma[j], difference)

# shorter version
# z <- outer(mu, sigma, Vectorize(function(a,b) L(a,b,difference)))

contour(mu, sigma, z, levels=c(1e-10,1e-6,2e-5,1e-4,2e-4,4e-4,6e-4,8e-4,1e-3,1.2e-3,1.4e-3))

And then superpose the graph of $\widehat{\sigma^2}(\mu)$:

hats2mu <- sapply(mu, function(mu0) mean( (difference-mu0)**2 ))
lines(mu, hats2mu, col="red", lwd=2)

contour plot of L

The values of the profile likelihood are the values taken by the likelihood along the red parabola.

You can use the profile likelihood just as a univariate classical likelihood (cf @Prokofiev’s answer). For example, the MLE $\hat\mu$ is the same.

For your confidence interval, the results will differ a little because of the curvature of the function $\widehat{\sigma^2}(\mu)$, but as long that you deal only with a short segment of it, it’s almost linear, and the difference will be very small.

You can also use the profile likelihood to build score tests, for example.

Solved – Why does the train data not fall in confidence interval with scikit-learn Gaussian Process

As far as I can see, the confidence interval that you are calculating is for the mean, not for the actual values, so the points do not need to fall inside it.

Best Answer

Related Solutions

Statistical Inference – Relationship Between Profile Likelihood and Confidence Intervals

Solved – Why does the train data not fall in confidence interval with scikit-learn Gaussian Process

Related Question