Assuming you are working with an i.i.d. sample of size $n$ from an exponential distribution with rate $\lambda$, the likelihood function given the sample $(x_1,\ldots,x_n)$ is of the form
$$L(\lambda)=\lambda^n\exp\left(-\lambda\sum_{i=1}^n x_i\right)\mathbf1_{x_1,\ldots,x_n>0},\qquad\lambda>0$$
The LR test criterion for testing $H_0:\lambda=\lambda_0$ against $H_1:\lambda\ne \lambda_0$ is given by
$$\Lambda(x_1,\ldots,x_n)=\frac{\sup\limits_{\lambda=\lambda_0}L(\lambda)}{\sup\limits_{\lambda}L(\lambda)}=\frac{L(\lambda_0)}{L(\hat\lambda)},$$
where $\hat\lambda$ is the unrestricted MLE of $\lambda$.
A routine calculation gives $$\hat\lambda=\frac{n}{\sum_{i=1}^n x_i}=\frac{1}{\bar x}$$
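To spell out the routine calculation: on $x_1,\ldots,x_n>0$ the log-likelihood and its derivative are
$$\ell(\lambda)=n\log\lambda-\lambda\sum_{i=1}^n x_i,\qquad \ell'(\lambda)=\frac{n}{\lambda}-\sum_{i=1}^n x_i,$$
and setting $\ell'(\lambda)=0$ gives the stated $\hat\lambda$; since $\ell''(\lambda)=-n/\lambda^2<0$, this critical point is indeed a maximum.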
Then we have
$$\Lambda(x_1,\ldots,x_n)=(\lambda_0\bar x)^n \exp(n(1-\lambda_0\bar x))=g(\bar x),\quad\text{say.}$$
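Explicitly, $L(\lambda_0)=\lambda_0^n e^{-n\lambda_0\bar x}$ and $L(\hat\lambda)=\bar x^{-n}e^{-n}$, and taking their ratio gives the expression above.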
Now study the function $g$ to justify that $$g(\bar x)<c \iff \bar x<c_1\quad\text{ or }\quad \bar x>c_2$$ for some constants $c_1,c_2$ determined from the level-$\alpha$ restriction
$$P_{H_0}(\overline X<c_1)+P_{H_0}(\overline X>c_2)\leqslant \alpha$$
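To see why the rejection region takes this two-sided form, write
$$\log g(t)=n\log(\lambda_0 t)+n(1-\lambda_0 t),\qquad \frac{d}{dt}\log g(t)=\frac{n}{t}-n\lambda_0,$$
so $g$ increases on $(0,1/\lambda_0)$, decreases on $(1/\lambda_0,\infty)$, and attains its maximum $g(1/\lambda_0)=1$. A unimodal function lies below a level $c<1$ exactly on two tails, which gives the region above.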
You are given an exponential population with mean $1/\lambda$, so we can rescale each $X_i$ to obtain an exponential distribution with mean $2$, which is the same as a chi-square distribution with $2$ degrees of freedom. Note the transformation
\begin{align}
X_i\stackrel{\text{ i.i.d }}{\sim}\text{Exp}(\lambda)&\implies 2\lambda X_i\stackrel{\text{ i.i.d }}{\sim}\chi^2_2
\\&\implies 2\lambda \sum_{i=1}^n X_i\sim \chi^2_{2n}
\end{align}
That is, we can find $c_1,c_2$ keeping in mind that under $H_0$, $$2n\lambda_0 \overline X\sim \chi^2_{2n}$$
We use this particular transformation to find the cutoff points $c_1,c_2$ in terms of the fractiles of some common distribution, in this case a chi-square distribution.
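For instance, here is a minimal sketch in R of the common equal-tailed simplification (splitting $\alpha$ evenly between the two tails rather than solving $g(c_1)=g(c_2)$ exactly; the values of $n$, $\lambda_0$, and $\alpha$ are assumed for illustration):

n = 20; lambda0 = 1; alpha = 0.05                  # illustrative values (assumed)
c1 = qchisq(alpha/2, df = 2*n)/(2*n*lambda0)       # lower cutoff for Xbar
c2 = qchisq(1 - alpha/2, df = 2*n)/(2*n*lambda0)   # upper cutoff for Xbar
c(c1, c2)                                          # reject H0 if Xbar < c1 or Xbar > c2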
You can verify that the LR test will reject the null hypothesis $H_0: p \ge 0.0408$ in favor
of the alternative $H_a: p < 0.0408$ for sufficiently small values of $\hat p = x/n,$ which is to say small values of $x.$
Then for $n = 1225, \hat p=0.020408,$ the P-value of your LR test will be $P(X \le 25\, |\, n=1225, p=0.0408) \approx 0.$ This P-value is much smaller than $0.05 = 5\%,$ so you reject $H_0$ in favor of $H_a.$
R code for the figure:
n = 1225; p = 0.0408; x = 0:90; PDF = dbinom(x, n, p)  # binomial PDF over x = 0..90
hdr = "PDF of BINOM(1225, 0.0408)"
plot(x, PDF, type="h", col="blue", lwd=2, main=hdr)    # spike plot of the PDF
abline(h=0, col="green2")                              # horizontal axis
abline(v=0, col="green2")                              # vertical axis
abline(v = 25.5, col="red", lwd=2, lty="dotted")       # observed count x = 25
Because $n$ is so large, you could also use a normal approximation for this P-value. The exact binomial probability can be found in R as shown below:
pbinom(25, 1225, 0.0408)
[1] 5.508296e-05
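For comparison, a normal approximation with continuity correction gives a P-value of the same tiny order (the cutoff 25.5 mirrors the dotted line in the figure):

n = 1225; p = 0.0408
z = (25.5 - n*p)/sqrt(n*p*(1-p))   # continuity-corrected z-score for x = 25
pnorm(z)                           # roughly 2e-04, leading to the same conclusion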
Note: When testing with a discrete probability distribution such as the binomial, it is not usually possible to do a (nonrandomized) test at exactly the 5% level.
But if you used $c = 38$ as the critical value for the test (that is, rejecting $H_0$ for $X\le c$), you would have significance level $\alpha=4.43\%$:
qbinom(.05, 1225, .0408)   # smallest x with CDF at least 0.05
[1] 39
pbinom(39, 1225, .0408)    # level with c = 39: exceeds 5%
[1] 0.06102795
pbinom(38, 1225, .0408)    # level with c = 38: 4.43%
[1] 0.04434609
If you use a normal approximation to find the critical value, it may seem that you are testing at the 5% level, but no possible (integer) value of $x$ gives a z-score very near $-1.645$.
qnorm(.05)
[1] -1.644854
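You can check this directly: the z-scores of the two candidate integer cutoffs straddle $-1.645$ without coming particularly close to it.

n = 1225; p = 0.0408
(c(38, 39) - n*p)/sqrt(n*p*(1-p))  # approximately -1.73 and -1.59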
To calculate the likelihood ratio test, you first calculate the maximum likelihood of your full assumed model. That is, treat the triple $(\alpha, \beta, \sigma^{2})$ as unknown, and use either analytic or numerical methods to compute the MLE of these parameters given your data, by maximizing the expression you provided for $L(\alpha,\beta,\sigma^{2})$.
For convenience, let $\hat{\theta}_{\textrm{F}} = (\alpha_{F}, \sigma_{F}^{2}; \beta_{F})$, where 'F' is meant to stand for 'Full' since you're using the full set of parameters when figuring out the MLE.
Next, assume that your null hypothesis is correct and that $\beta=0$. Then let $\hat{\theta}_{R} = (\alpha_{R}, \sigma_{R}^{2}; 0)$, where we plug in the null value of $\beta$ and then estimate the MLE with that fixed assumption. The 'R' here stands for 'Restricted' since we're estimating the MLE with the extra restriction on $\beta$.
Then with this notation, the likelihood ratio test statistic is given by $$ LR = 2\left( \ell(\hat{\theta}_{F}) - \ell(\hat{\theta}_{R})\right),$$ where $\ell$ denotes the log-likelihood; equivalently, $LR = 2\log\bigl(L(\hat{\theta}_{F})/L(\hat{\theta}_{R})\bigr)$.
Assuming the null hypothesis is true, for large sample sizes $N$ the statistic $LR$ has an approximate $\chi^{2}$ distribution with degrees of freedom $K_{0}$, where $K_{0}$ is the number of parameters restricted by the null hypothesis. In this case you are restricting only a single scalar, $\beta$, so $K_{0}=1$. If your null hypothesis involved multiple parameters, the degrees of freedom would change accordingly.
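As a concrete sketch in R (the model and data here are simulated purely for illustration; for a Gaussian linear model, `logLik` on an `lm` fit returns the log-likelihood maximized over $\sigma^{2}$):

set.seed(1)
n = 100
x = rnorm(n)
y = 1 + rnorm(n)                       # simulate under H0: beta = 0
full = lm(y ~ x)                       # unrestricted fit: alpha, beta, sigma^2
restricted = lm(y ~ 1)                 # restricted fit: beta fixed at 0
LR = 2*(as.numeric(logLik(full)) - as.numeric(logLik(restricted)))
pchisq(LR, df = 1, lower.tail = FALSE) # asymptotic chi-square(1) P-value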
The idea behind this test is that if the null hypothesis is true, the maximized value of the likelihood should not be much different whether you compute the unrestricted MLE or the MLE with the null-hypothesis restriction applied.
The large-sample distribution is derived by showing that both the (negative) outer product of the score vector (the log-likelihood derivatives) and the (negative) Hessian of the log-likelihood converge to the information matrix. One can then expand the log-likelihood in a Taylor series around the true parameter, truncate at the quadratic term, and compare its values at the restricted and unrestricted estimates to show that $LR$ is asymptotically distributed as $\chi^{2}(K_{0})$.