Likelihood Estimation – Approximating the 2-Likelihood Region in R for Maximum Likelihood Analysis

confidence intervalmathematical-statisticsmaximum likelihood

By following the distribution I provided here on stackmathematics

Q. A biased die favours the number 2. It's rolled 4 times and 2 comes up twice. It's rolled again 4 times and 2 comes up once. Calculate the Likelihood and find the MLE.

The likelihood is: $L(\theta;3) = \binom{8}{3}\theta^3(1-\theta)^5$, whereas we have the local maximum at $\theta = \frac{3}{8}$.

I want to verify that, approximately $(0.0718, 0.6026)$ is a 2-likelihood region for $\theta$ by making a plot to indicate this.

From what I understand, the 2-likelihood region is given as the following:

The region $R_k = \{\theta \in \mathbb{R}: L(\theta;x) \ge e^{-k}L(\hat{\theta};x)\}$ for the k-unit likelihood region for parameters $\theta$ based on data $x$.

We know that $x=3$ and $\theta = \frac{3}{8}$, then we apply the following:

$$\binom{8}{3}\theta^3(1-\theta)^5 \ge e^{-2}L\left(\frac{3}{8};3\right)\approx0.035$$

So I think the next step is to plug $0.0718, 0.6026$ into $\theta$ to compare the inequality.

Something like:

$$\binom{8}{3}(0.0718)^3(1-0.0718)^5 \approx 0.014$$

However, $0.014 < 0.035$ and so $0.0718$ does not fall in the 2-likelihood region?

If this is the case, is there an easier way to check for values that fall within the confidence interval of the likelihood region on R?

Best Answer

Let $a=3$ and $b=5,$ so that $\hat\theta = a/(a+b).$ Let the likelihood function be $L(\theta;a,b).$ Taking logarithms, express the region as

$$R_k = \{\theta\in[0,1]\mid \log L(\theta;a,b) - \log L(\hat\theta;a,b) + k \ge 0\}.$$

Because for $k \gt 0$ there will be a lower limit $l$ between $0$ and $\hat\theta$ and an upper limit $u$ between $\hat\theta$ and $1,$ use any decent root finder within each of these intervals. Here is a sketch of the situation, showing the graph of the difference in log likelihoods as a function of $\theta$ on the interval $[0,1].$

Figure

This is a general picture of how maximum likelihood confidence intervals are found in any one-parameter situation where the likelihood is smooth and has a unique global maximum in the interior of the parameter space. (Sometimes one or both limits don't exist, though, depending on how that parameter space is defined.)

For an interval with confidence $1-\alpha,$ $2k$ is the upper $1-\alpha$ quantile of the chi-squared distribution with one degree of freedom. For instance, for $95\%$ confidence $k= 3.84\ldots/2 \approx 1.92.$

For $k=2,$ the built in uniroot function in R computes $l=0.1060\ldots$ and $u=0.7157\ldots.$ The check of any putative limit, such as those mentioned in the question, is trivial: just compare it to these endpoints.

a <- 3
b <- 5
k <- 2
Lambda <- function(x) a * log(x) + b * log(1-x)
theta.hat <- a/(a+b)
f <- function(theta) Lambda(theta) - Lambda(theta.hat)
l <- uniroot(function(x) f(x) + k, c(0, theta.hat))$root
u <- uniroot(function(x) f(x) + k, c(theta.hat, 1))$root
(c(Lower=l, Upper=u))
    Lower     Upper 
0.1060292 0.7157523

Notice it was unnecessary to compute the constant factors $\binom{8}{3}$ in Lambda (the log likelihood function) because they cancel in taking the difference of log likelihoods. This, too, is a general phenomenon.