Finding Maximum Likelihood Estimates for Multiple Normal Populations

likelihood-ratio, maximum-likelihood, normal-distribution, self-study

I've just started studying maximum likelihood and likelihood ratio tests. I've calculated the maximum likelihood of a normal population with unknown mean and variance. However, I've been given this problem:

Independent random samples of size $n_1, n_2, … ,n_k$ from $k$ normal populations with unknown means and
equal but unknown variances are to be used to test the null hypothesis $H_0: \mu_1=\mu_2=…=\mu_k$ versus the
alternative that these means are not all equal.

Find the restricted and unrestricted MLEs of $\mu_1…\mu_k$ and $\sigma$ and find the likelihood ratio statistic.

The multiple $\mu$s are really throwing me for a loop. So far I think I have the likelihood function correct: $$\prod\limits_{i=1}^k \left(\frac{1}{\sqrt{2\pi}\,\sigma}\right)^{n_i}\exp\left\{-\frac{1}{2\sigma^2}\sum_{j=1}^{n_i} (x_{ij}-\mu)^2\right\}$$

However, I'm a little unsure how to proceed from here. I understand I'm supposed to take the log of the function and then take the partial derivatives with respect to $\mu$ and $\sigma$, but the products and sums are throwing me off a bit. Is there a different approach to solving this problem, or am I just overthinking it? Did I even set it up correctly? Any help would be greatly appreciated. Thanks!

Best Answer

As hinted in a comment, you have to set up a hypothesis test. Under the null hypothesis $H_0$ all the means are equal, so the likelihood is $$ \mathcal{L}_0 = \prod_{i=1}^n\frac1{\sqrt{2\pi}\sigma} \exp\left(-\frac{(x_i-\mu)^2}{2\sigma^2}\right) $$ where the product runs over all the data points pooled together ($n=\sum_{j=1}^k n_j$). We know the MLEs in this case: $$ \hat{\mu}=\frac1{n}\sum_{i=1}^nx_i $$ and $$ \hat{\sigma}_0^2=\frac1{n}\sum_{i=1}^nx_i^2-\left(\frac1{n}\sum_{i=1}^nx_i\right)^2 $$ Substituting these back into the log-likelihood gives $$ \log\mathcal{L}_0=-n\left(\log\left(\hat{\sigma}_0\sqrt{2\pi}\right)+\tfrac12\right) $$
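As a quick numerical sanity check on that closed form (the data here are simulated and NumPy/SciPy are my own additions, not part of the problem), you can compare the log-likelihood summed term by term against $-n(\log(\hat{\sigma}_0\sqrt{2\pi})+1/2)$:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
# three made-up samples, pooled together under H0
samples = [rng.normal(5.0, 2.0, size=m) for m in (10, 15, 20)]
x = np.concatenate(samples)
n = len(x)

mu_hat = x.mean()                        # pooled MLE of the common mean
sigma2_hat = np.mean(x**2) - mu_hat**2   # ML (biased) variance estimate

# log-likelihood evaluated the long way...
loglik = norm.logpdf(x, loc=mu_hat, scale=np.sqrt(sigma2_hat)).sum()
# ...and via the closed form derived above
closed_form = -n * (np.log(np.sqrt(2 * np.pi * sigma2_hat)) + 0.5)

print(np.isclose(loglik, closed_form))  # True
```

The agreement is exact up to floating-point error, because plugging the MLE of $\sigma^2$ into the quadratic term always collapses it to $n/2$.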

Under the alternative $H_1$ each sample has its own mean. Since the samples are independent, the likelihood is simply the product of the likelihoods of the individual samples: $$ \mathcal{L}_1 = \prod_{j=1}^k \prod_{i=1}^{n_j}\frac1{\sqrt{2\pi}\sigma} \exp\left(-\frac{(x_{ji}-\mu_j)^2}{2\sigma^2}\right) $$ where $x_{ji}$ is the $i$-th observation of sample $j$. The solution for each $\mu_j$ is analogous to before: $$ \hat{\mu}_j=\frac1{n_j}\sum_{i=1}^{n_j}x_{ji} $$ Before substituting it into the likelihood, let's do some 'massage': $$ \mathcal{L}_1 = \frac1{(\sqrt{2\pi}\sigma)^n} \prod_{j=1}^k \prod_{i=1}^{n_j} \exp\left(-\frac{(x_{ji}-\hat{\mu}_j)^2}{2\sigma^2}\right) $$ The double product of exponentials becomes a single exponential of a double sum: $$ \prod_{j=1}^k \prod_{i=1}^{n_j} \exp\left(-\frac{(x_{ji}-\hat{\mu}_j)^2}{2\sigma^2}\right) = \exp\left( -\sum_{j=1}^k \sum_{i=1}^{n_j} \frac{(x_{ji}-\hat{\mu}_j)^2}{2\sigma^2} \right) $$ Taking the logarithm, $$ \log\mathcal{L}_1=-n\log\left(\sigma\sqrt{2\pi}\right)-\sum_{j=1}^k \sum_{i=1}^{n_j} \frac{(x_{ji}-\hat{\mu}_j)^2}{2\sigma^2} $$ Writing $\hat{\sigma}^2_j$ for the ML variance of sample $j$, so that $\sum_{i=1}^{n_j}(x_{ji}-\hat{\mu}_j)^2=n_j\hat{\sigma}_j^2$, this is $$ \log\mathcal{L}_1=-n\log\left(\sigma\sqrt{2\pi}\right)-\sum_{j=1}^k \frac{n_j \hat{\sigma}_j^2}{2\sigma^2} $$ Differentiating with respect to $\sigma$ and setting the derivative to $0$ gives the pooled-variance MLE $$ \hat{\sigma}_1^2=\sum_{j=1}^k\frac{n_j}{n}\hat{\sigma}_j^2 $$ and hence, just as in the restricted case, $$ \log\mathcal{L}_1=-n\left(\log\left(\hat{\sigma}_1\sqrt{2\pi}\right)+\tfrac12\right) $$ The log likelihood ratio is the difference of the two log-likelihoods: $$ \log\frac{\mathcal{L}_0}{\mathcal{L}_1}=\frac{n}{2}\log\left( \frac{ \frac1n\sum_{j=1}^k \sum_{i=1}^{n_j}(x_{ji}-\hat{\mu}_j)^2 }{ \frac1n\sum_{j=1}^k \sum_{i=1}^{n_j}(x_{ji}-\hat{\mu})^2 }\right) $$ Replacing the values of $\hat{\mu}$ and $\hat{\mu}_j$, you eventually get $$ \Lambda = \frac{\mathcal{L}_0}{\mathcal{L}_1} = \left(\frac{\sum_{j=1}^k\sum_{i=1}^{n_j}x_{ji}^2-\sum_{j=1}^k\frac1{n_j}\left(\sum_{i=1}^{n_j}x_{ji}\right)^2}{\sum_{j=1}^k\sum_{i=1}^{n_j}x_{ji}^2-\frac1{n}\left(\sum_{j=1}^k\sum_{i=1}^{n_j}x_{ji}\right)^2}\right)^{n/2} $$ i.e. the ratio of the within-sample sum of squares to the total sum of squares, raised to the power $n/2$; small values of $\Lambda$ favour rejecting $H_0$.
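The whole derivation can be checked numerically. The sketch below (simulated data and NumPy/SciPy are my own additions, not part of the problem) computes $\Lambda$ from the two sums of squares and verifies it against the likelihoods evaluated directly:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
# k = 3 made-up samples with different means and a shared variance
samples = [rng.normal(mu, 1.5, size=m)
           for mu, m in [(4.0, 12), (5.0, 18), (6.5, 25)]]
x = np.concatenate(samples)
n = len(x)

# restricted (H0): one common mean -> total sum of squares = n * sigma0_hat^2
ss_total = np.sum((x - x.mean())**2)

# unrestricted (H1): one mean per sample -> within sum of squares = n * sigma1_hat^2
ss_within = sum(np.sum((s - s.mean())**2) for s in samples)

# likelihood ratio: Lambda = L0 / L1 = (sigma1_hat^2 / sigma0_hat^2)^(n/2)
Lambda = (ss_within / ss_total) ** (n / 2)

# cross-check against the log-likelihoods computed term by term
s0 = np.sqrt(ss_total / n)
s1 = np.sqrt(ss_within / n)
l0 = norm.logpdf(x, x.mean(), s0).sum()
l1 = sum(norm.logpdf(s, s.mean(), s1).sum() for s in samples)
print(np.isclose(np.log(Lambda), l0 - l1))  # True
```

Since the group means always fit at least as well as a single common mean, `ss_within <= ss_total` and therefore $0<\Lambda\le 1$, as a likelihood ratio statistic must be.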