### A. Finding the form of a likelihood ratio test

The basic procedure for constructing a likelihood ratio test is as follows:

1. Maximize the likelihood under the null, $\hat{\mathcal{L}}_0$ (by substituting the MLE under the null into the likelihood for each sample).

2. Maximize the likelihood under the alternative, $\hat{\mathcal{L}}_1$, in the same manner.

3. Take the ratio $\Lambda = \hat{\mathcal{L}}_0/\hat{\mathcal{L}}_1$.

4. Reject if this ratio is too small (less than some critical value $k$ chosen to give a test of the appropriate level). [Some sources take $\hat{\mathcal{L}}_1/\hat{\mathcal{L}}_0$ and reject if it's too large; this is the same thing.]
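As a concrete sketch of this recipe for the two-binomial problem (the function name and argument names here are illustrative, not part of any standard API):

```r
# Illustrative sketch: likelihood ratio for two independent binomial
# samples, x1 successes out of n1 trials and x2 out of n2.
lr_ratio <- function(x1, n1, x2, n2) {
  p0 <- (x1 + x2) / (n1 + n2)                   # MLE of the common p under H0
  L0 <- dbinom(x1, n1, p0) * dbinom(x2, n2, p0) # maximized likelihood under H0
  L1 <- dbinom(x1, n1, x1 / n1) *               # separate MLEs under H1
        dbinom(x2, n2, x2 / n2)
  L0 / L1                                       # reject when this is too small
}
```

The function carries out steps 1–3; step 4 (the rejection decision) depends on the critical value you choose.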

So for the null, where $p_A=p_B$, you combine the samples, write down the estimator of the common $p_0$, and substitute that into the binomial likelihood for the two samples (the joint likelihood is the product of the individual likelihoods, by independence).

For the alternative, you substitute the MLE of the $p$ parameter for each of the two samples separately into the likelihood for that sample, taking their product.

Then take the ratio and simplify. Note that constants remaining after cancellation can be shifted out (pushed over to the other side of the inequality with $k$), and various other transformations can be performed to simplify things (as long as the inequality is correctly maintained); you should be able to reduce it to a simple rejection rule in terms of some new constant $k^\prime$.
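To make the cancellation explicit: writing $x_1$ successes out of $n_1$ trials and $x_2$ out of $n_2$ (notation introduced here purely for illustration), the ratio is

$$\Lambda = \frac{\binom{n_1}{x_1}\hat p_0^{\,x_1}(1-\hat p_0)^{n_1-x_1}\,\binom{n_2}{x_2}\hat p_0^{\,x_2}(1-\hat p_0)^{n_2-x_2}}{\binom{n_1}{x_1}\hat p_A^{\,x_1}(1-\hat p_A)^{n_1-x_1}\,\binom{n_2}{x_2}\hat p_B^{\,x_2}(1-\hat p_B)^{n_2-x_2}}$$

where $\hat p_0 = (x_1+x_2)/(n_1+n_2)$, $\hat p_A = x_1/n_1$ and $\hat p_B = x_2/n_2$. The binomial coefficients cancel, leaving

$$\Lambda = \frac{\hat p_0^{\,x_1+x_2}\,(1-\hat p_0)^{\,n_1+n_2-x_1-x_2}}{\hat p_A^{\,x_1}(1-\hat p_A)^{n_1-x_1}\,\hat p_B^{\,x_2}(1-\hat p_B)^{n_2-x_2}}.$$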

If you're particularly lucky, you can eliminate dependence on the alternative (which should mean you have a UMP test, at least against a one-tailed alternative). In some cases you may even be able to compute the required $k^\prime$ to achieve the type I error rate you want, and so obtain a small-sample test.

Otherwise, it's common to apply an asymptotic result.

[I think this particular case boils down to a statistic equivalent to the one in the 2x2 G-test.]

### B. Conducting a test directly on data -- using Wilks' theorem

For this you can just substitute the numbers straight into the likelihoods (the $\hat{\mathcal{L}}$ quantities above), compute their numerical values, and take minus twice the log of their ratio, comparing it with a chi-squared distribution (with df equal to the difference in the number of free parameters). [NB: $H_0$ must be nested in $H_1$, and not on a boundary of the parameter space, for this to work.]

For your specific example, I get:

```
L0 <- dbinom(15,20,39/60)*dbinom(24,40,39/60)  # pooled estimate: (15+24)/(20+40) = 39/60
L1 <- dbinom(15,20,15/20)*dbinom(24,40,24/40)  # separate estimates: 15/20 and 24/40
-2*log(L0/L1)
[1] 1.359258
```

Applying Wilks' theorem, that should be asymptotically $\chi^2_1$ under the null, so this has an asymptotic p-value of about 0.24.
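That tail probability can be checked directly from the statistic computed above:

```r
# Upper tail of a chi-squared(1) distribution at the observed statistic
pchisq(1.359258, df = 1, lower.tail = FALSE)  # about 0.24
```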

Edit: It turns out the G-test ($G = 2\sum_i O_i \log(O_i/E_i)$) gives the same value for the test statistic (1.36), so that's encouraging; it suggests the calculation was carried out correctly. For comparison, with largish counts like these the ordinary chi-square should be similar as well, and the uncorrected chi-square is 1.32, which is quite close (p-value of 0.25).
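Both statistics quoted above can be reproduced from the 2x2 table of counts (successes and failures in each sample):

```r
O <- matrix(c(15,  5,
              24, 16), nrow = 2, byrow = TRUE)  # observed counts
E <- outer(rowSums(O), colSums(O)) / sum(O)     # expected counts under H0
G  <- 2 * sum(O * log(O / E))                   # G statistic, about 1.36
X2 <- sum((O - E)^2 / E)                        # uncorrected Pearson chi-square, about 1.32
```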

## Best Answer

If $X$ is $\text{Binomial}(n, p)$ then the MLE of $p$ is $\hat{p} = X/n$.
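A quick numerical sanity check (using values from the example above) confirms that the binomial likelihood is maximized at $x/n$:

```r
# Maximize the binomial likelihood in p for x = 15 successes in n = 20 trials
x <- 15; n <- 20
optimize(function(p) dbinom(x, n, p), interval = c(0, 1), maximum = TRUE)$maximum
# close to x/n = 0.75
```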

A binomial variable can be thought of as the sum of $n$ Bernoulli random variables: $X = \sum_{i=1}^n Y_i$ where $Y_i\sim\text{Bernoulli}(p)$.

So we can calculate the variance of the MLE $\hat{p}$ as

$$\begin{align*} \text{Var}[\hat{p}] &= \text{Var}\left[\dfrac{1}{n}\sum_{i=1}^n Y_i\right]\\ &= \dfrac{1}{n^2}\sum_{i=1}^n \text{Var}[Y_i]\\ &= \dfrac{1}{n^2}\sum_{i=1}^n p(1-p)\\ &= \dfrac{p(1-p)}{n} \end{align*}$$

So you can see that the variance of the MLE gets smaller for large $n$, and is also smaller for $p$ close to 0 or 1; as a function of $p$, it is maximized at $p=0.5$.
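A small simulation (with arbitrary illustrative values of $n$ and $p$) confirms the formula:

```r
# Check Var(p-hat) = p(1-p)/n by simulation
set.seed(1)
n <- 50; p <- 0.3
phat <- rbinom(1e5, n, p) / n  # 100,000 simulated MLEs
var(phat)                      # close to the theoretical value
p * (1 - p) / n                # theoretical value: 0.0042
```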

For some confidence intervals, you can check out Binomial Confidence Intervals.