[Math] Likelihood Ratio Test for Linear Regression

hypothesis-testing, statistics

I apologize for the image I originally posted below. I am new to StackExchange and I am not yet familiar with MathJax equations, so I took a screenshot.

Here is my question:

Let the independent random variables $Y_1, \ldots , Y_n$ have the following joint pdf, where $x_1, \ldots , x_n$ are not all equal. We want to use a likelihood ratio test to test the null hypothesis specified below against all possible alternative hypotheses.
$$
L(\alpha, \beta, \sigma ^ 2) =
\left(\frac{1}{2 \pi \sigma ^ 2} \right) ^ {n / 2}
\exp\left\{-\frac{1}{2 \sigma ^ 2}
\sum_{i = 1}^n \left[y_i - \alpha - \beta (x_i - \bar x)\right] ^ 2
\right\}
$$
$$
H_0: \beta = 0 \text{ ($\alpha$ and $\sigma^2$ unknown)}
$$

The question asks to find the likelihood ratio test statistic and to check whether it can be based on a familiar test statistic.

So far, all I know is that I believe it will be based on a $T$-statistic, but I do not know how to show this.

Best Answer

To calculate the likelihood ratio test, you first calculate the maximized likelihood of your full assumed model. So you treat the triple $(\alpha, \beta, \sigma^{2})$ as all unknown, and use either analytic or numerical methods to compute the maximum likelihood estimates of these parameters given your data, by maximizing the expression you provided for $L(\alpha,\beta,\sigma^{2})$.

For convenience, let $\hat{\theta}_{F} = (\hat{\alpha}_{F}, \hat{\sigma}_{F}^{2}; \hat{\beta}_{F})$ denote the resulting estimates, where 'F' stands for 'Full', since you're using the full set of parameters when computing the MLE.

Next, assume that your null hypothesis is correct, so that $\beta=0$. Then let $\hat{\theta}_{R} = (\hat{\alpha}_{R}, \hat{\sigma}_{R}^{2}; 0)$, where we plug in the null value of $\beta$ and maximize the likelihood under that fixed assumption. The 'R' stands for 'Restricted', since we're maximizing with the extra restriction on $\beta$.
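For this particular Gaussian model, both maximizations can actually be done in closed form. Here is a sketch using standard least-squares algebra, writing $S_{xx} = \sum_{i=1}^n (x_i - \bar x)^2$: because the covariate is centered, $\hat{\alpha}_{F} = \hat{\alpha}_{R} = \bar y$ in both models, and
$$
\hat{\beta}_{F} = \frac{\sum_{i=1}^n (x_i - \bar x)\, y_i}{S_{xx}},
\qquad
\hat{\sigma}_{F}^{2} = \frac{1}{n}\sum_{i=1}^n \bigl[y_i - \bar y - \hat{\beta}_{F}(x_i - \bar x)\bigr]^{2},
\qquad
\hat{\sigma}_{R}^{2} = \frac{1}{n}\sum_{i=1}^n (y_i - \bar y)^{2}.
$$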

Then with this notation, and writing $\ell(\theta) = \log L(\theta)$ for the log-likelihood, the likelihood ratio test statistic is given by $$ LR = 2\,\bigl( \ell(\hat{\theta}_{F}) - \ell(\hat{\theta}_{R})\bigr) = 2 \log \frac{L(\hat{\theta}_{F})}{L(\hat{\theta}_{R})}.$$
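In fact, for this model you can go beyond the large-sample approximation. Plugging the MLE of $\sigma^2$ back into the likelihood gives $L(\hat{\theta}) = (2\pi\hat{\sigma}^{2})^{-n/2} e^{-n/2}$, so the likelihood ratio $\Lambda = L(\hat{\theta}_{R})/L(\hat{\theta}_{F})$ reduces to
$$
\Lambda = \left(\frac{\hat{\sigma}_{F}^{2}}{\hat{\sigma}_{R}^{2}}\right)^{n/2},
\qquad
\Lambda^{2/n} = \frac{1}{1 + T^{2}/(n-2)},
\qquad
T = \frac{\hat{\beta}_{F}\,\sqrt{S_{xx}}}{\sqrt{\frac{1}{n-2}\sum_{i=1}^n \bigl[y_i - \bar y - \hat{\beta}_{F}(x_i - \bar x)\bigr]^{2}}},
$$
where the second identity follows from the decomposition $\sum_i (y_i - \bar y)^2 = \sum_i [y_i - \bar y - \hat{\beta}_{F}(x_i - \bar x)]^2 + \hat{\beta}_{F}^{2} S_{xx}$. Under $H_0$, $T$ has a $t$ distribution with $n-2$ degrees of freedom, and $\Lambda$ is a decreasing function of $T^{2}$, so rejecting for small $\Lambda$ is exactly the usual two-sided $t$-test of $\beta = 0$. That confirms the intuition in the question.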

Assuming the null hypothesis is true, for large sample sizes $n$, $LR$ has approximately a $\chi^{2}$ distribution with degrees of freedom equal to $K_{0}$, where $K_{0}$ is the number of parameters being restricted by the hypothesis. In this case, you're restricting the value of a single scalar, $\beta$, so $K_{0} = 1$. But if your null hypothesis restricted multiple parameters, the degrees of freedom would change accordingly.
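If you want to see the asymptotic $\chi^2$ approximation and the exact $t$-equivalence side by side, here is a minimal numerical sketch, assuming NumPy and SciPy; the simulated data and parameter values are illustrative, not from the question:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Illustrative simulated data (not from the question).
n = 50
x = rng.uniform(0.0, 10.0, size=n)
y = 2.0 + 0.5 * (x - x.mean()) + rng.normal(scale=1.5, size=n)

xc = x - x.mean()             # centered covariate, as in the model
Sxx = np.sum(xc ** 2)

# Full-model MLEs (closed form for this Gaussian model)
alpha_F = y.mean()
beta_F = np.sum(xc * y) / Sxx
sse_F = np.sum((y - alpha_F - beta_F * xc) ** 2)
sigma2_F = sse_F / n          # MLE divides by n, not n - 2

# Restricted-model MLEs (beta fixed at 0)
sse_R = np.sum((y - y.mean()) ** 2)
sigma2_R = sse_R / n

# LR statistic: twice the difference in maximized log-likelihoods,
# which for this model simplifies to n * log(sigma2_R / sigma2_F).
LR = n * np.log(sigma2_R / sigma2_F)
p_chi2 = stats.chi2.sf(LR, df=1)    # asymptotic chi^2(1) p-value

# Exact t-test of beta = 0, equivalent to the LRT here
t_stat = beta_F * np.sqrt(Sxx) / np.sqrt(sse_F / (n - 2))
p_t = 2 * stats.t.sf(abs(t_stat), df=n - 2)

print(f"LR = {LR:.3f}, chi2(1) p-value = {p_chi2:.4f}")
print(f"t  = {t_stat:.3f}, t(n-2) p-value = {p_t:.4f}")
```

The two p-values won't agree exactly in small samples (the $\chi^2$ result is only asymptotic), but they should be close, and the identity $LR = n \log\bigl(1 + T^2/(n-2)\bigr)$ ties the two statistics together.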

The idea behind this test is that if the null hypothesis is true, then the maximized value of the likelihood shouldn't differ much between the unrestricted fit and the fit with the null-hypothesis restriction imposed.

The large-sample distribution is proved by looking at the convergence of the (negative) outer product of the score (the derivatives of the log-likelihood) and the convergence of the (negative) Hessian of the log-likelihood. Both converge to the Fisher information matrix, so you can do a Taylor expansion of the log-likelihood around the true parameter. Truncating the expansion at the quadratic term, and comparing its value at the restricted vs. the unrestricted parameter estimate, shows that the statistic is asymptotically distributed as $\chi^{2}(K_{0})$.
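Schematically, as a heuristic sketch that suppresses the regularity conditions, the quadratic term of the expansion around the true parameter $\theta_0$ gives
$$
2\bigl[\ell(\hat{\theta}) - \ell(\theta_0)\bigr]
\;\approx\;
(\hat{\theta} - \theta_0)^{\mathsf T}\, I(\theta_0)\, (\hat{\theta} - \theta_0),
\qquad
\hat{\theta} - \theta_0 \;\approx\; I(\theta_0)^{-1}\, \nabla \ell(\theta_0),
$$
where $I(\theta_0)$ is the information matrix. Subtracting the analogous expansion for the restricted estimate cancels everything except a quadratic form in the $K_0$ restricted directions, which is asymptotically $\chi^{2}(K_{0})$.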
