Solved – Maximum Likelihood Estimation


If $V_1, V_2,\ldots V_{n_1}$ and $W_1, W_2,\ldots,W_{n_2}$ are independent random samples of size $n_1$ and $n_2$ from normal populations with the means $\mu_1$, $\mu_2$ and the common variance $\sigma^2$, find maximum likelihood estimators for $\mu_1, \mu_2$ and $\sigma^2$.

My idea is to find the first-order conditions (FOCs) separately for sample 1 and for sample 2, and then use the four resulting equations to estimate the parameters. Any suggestions? Also, is the difference in sample sizes a problem?

Thanks!

Best Answer

According to the question, it is an assumed fact that both populations have a common variance, and not something one wishes to test. Maximum likelihood estimators can be derived as usual either from the two samples separately, or by pooling them, in which case we will have an independent but non-identically distributed sample and a corresponding log-likelihood, something that nevertheless creates no special issues. So, more than deriving the MLEs (which is straightforward), I would say that this is a good example for examining whether pooling the samples ("unite and conquer"?) is more beneficial than keeping them separate ("divide and conquer"?). But "more beneficial" according to which criteria?

We will discuss them as we go along.

Note that we need both sample sizes to be greater than unity, $n_1 >1, n_2 > 1$; otherwise the corresponding variance estimator will be identically zero (a single observation coincides with its own sample mean).

If we keep the samples separate we will obtain

$$\hat \mu_v = \frac 1{n_1}\sum_{i=1}^{n_1}v_i,\;\;\; \hat \sigma^2_1 = \frac 1{n_1}\sum_{i=1}^{n_1}(v_i-\hat \mu_v)^2$$

and $$\hat \mu_w = \frac 1{n_2}\sum_{i=1}^{n_2}w_i,\;\;\; \hat \sigma^2_2 = \frac 1{n_2}\sum_{i=1}^{n_2}(w_i-\hat \mu_w)^2$$
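As a quick numerical illustration (a minimal sketch; the sample sizes, means and variance below are arbitrary choices, not taken from the question), these estimators are just the sample means and the divide-by-$n$ sample variances:

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary illustrative values (not from the question)
n1, n2, mu1, mu2, sigma = 30, 50, 1.0, 2.0, 1.5
v = rng.normal(mu1, sigma, n1)
w = rng.normal(mu2, sigma, n2)

# Separate-sample MLEs: sample means, and variances with divisor n (ddof=0)
mu_v_hat = v.mean()
mu_w_hat = w.mean()
sigma2_1_hat = np.mean((v - mu_v_hat) ** 2)   # equals v.var(ddof=0)
sigma2_2_hat = np.mean((w - mu_w_hat) ** 2)   # equals w.var(ddof=0)
```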

The MLEs for the means will be unbiased, efficient, consistent and asymptotically normal.

The variance estimators will be biased, consistent and asymptotically normal (see this post, which holds in general, even for normal samples).

Since we have bias here, it is natural to turn to the Mean Squared Error. The populations are normal, so we also have a finite-sample result:

$$\frac {n_i\hat \sigma^2_i}{\sigma^2} \sim \chi^2_{n_i-1} \Rightarrow \hat \sigma^2_i \sim \operatorname{Gamma}(k_i,\theta_i),\;\; k_i = \frac {n_i-1}{2},\;\; \theta_i = \frac {2\sigma^2}{n_i},\;\;i=1,2$$
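The Gamma form is just the usual rescaling of a chi-square: $\chi^2_m \equiv \operatorname{Gamma}\left(\frac m2,\, 2\right)$, and multiplying a Gamma variable by a constant $c>0$ keeps the shape and multiplies the scale by $c$; here

$$\hat \sigma^2_i = \frac{\sigma^2}{n_i}\cdot \frac{n_i\hat \sigma^2_i}{\sigma^2}, \qquad c = \frac{\sigma^2}{n_i}$$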

Therefore we can calculate the Mean Squared Error (MSE) as

$$MSE(\hat \sigma^2_i) = \text{Var}(\hat \sigma^2_i)+\left[B(\hat \sigma^2_i)\right]^2 = \frac{2(n_i-1)}{n_i^2} \sigma^4 + \frac 1{n_i^2}\sigma^4 = \frac{2n_i-1}{n_i^2} \sigma^4$$
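If one wants a numerical sanity check of this finite-sample formula, a small simulation (a sketch only; $n$, $\sigma$ and the replication count are arbitrary choices) gives a Monte Carlo MSE close to the theoretical value:

```python
import numpy as np

rng = np.random.default_rng(1)
n, sigma, reps = 20, 1.5, 200_000            # arbitrary illustrative choices

# reps independent samples of size n; the mean is irrelevant for the variance MLE
x = rng.normal(0.0, sigma, size=(reps, n))
sigma2_hat = ((x - x.mean(axis=1, keepdims=True)) ** 2).mean(axis=1)  # MLE, divisor n

mse_mc = np.mean((sigma2_hat - sigma**2) ** 2)
mse_theory = (2 * n - 1) / n**2 * sigma**4
print(mse_mc, mse_theory)                    # the two numbers should roughly agree
```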

We turn now to the pooled-samples case.
It is easy to verify that the MLEs for the two means will be identical to those of the separate-samples approach. So as regards these estimators, pooling the two samples or not makes no difference to the functional form of the estimators or to their properties.
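To be concrete, under the stated assumptions the pooled log-likelihood is

$$\ell(\mu_1,\mu_2,\sigma^2) = -\frac{n_1+n_2}{2}\ln\left(2\pi\sigma^2\right) - \frac{1}{2\sigma^2}\left[\sum_{i=1}^{n_1}(v_i-\mu_1)^2 + \sum_{j=1}^{n_2}(w_j-\mu_2)^2\right]$$

and the first-order conditions for $\mu_1$ and $\mu_2$ give back the two sample means.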

But the variance estimator will be different. It is also rather easy to derive that

$$\hat \sigma^2_p = \frac{n_1}{n_1+n_2}\hat \sigma^2_1+\frac{n_2}{n_1+n_2}\hat \sigma^2_2$$
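Indeed, the first-order condition for $\sigma^2$ in the pooled log-likelihood gives

$$\hat \sigma^2_p = \frac{1}{n_1+n_2}\left[\sum_{i=1}^{n_1}(v_i-\hat \mu_v)^2 + \sum_{j=1}^{n_2}(w_j-\hat \mu_w)^2\right]$$

and grouping the two sums yields the weighted-average form above.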

This is also a biased and consistent estimator, and also asymptotically normal, being a convex combination of two asymptotically normal variables.

Turning to the issue of bias and Mean Squared Error, since the two separate-samples estimators are independent we have that

$$\text{Var}(\hat \sigma^2_p) = \frac{n_1^2}{(n_1+n_2)^2}\frac{2(n_1-1)}{n_1^2} \sigma^4+\frac{n_2^2}{(n_1+n_2)^2}\frac{2(n_2-1)}{n_2^2}\sigma^4 = \frac {2n_1+2n_2-4}{(n_1+n_2)^2}\sigma^4$$

and

$$B\left(\hat \sigma^2_p\right) = \frac{n_1}{n_1+n_2}E(\hat \sigma^2_1)+\frac{n_2}{n_1+n_2}E(\hat \sigma^2_2) - \sigma^2 = \frac{(n_1-1)+(n_2-1)}{n_1+n_2}\sigma^2 - \sigma^2 = \frac {-2}{n_1+n_2} \sigma^2$$

where we used $E(\hat \sigma^2_i) = \frac{n_i-1}{n_i}\sigma^2$, which follows from the Gamma result above.

So the MSE here is

$$MSE(\hat \sigma^2_p) = \frac {2n_1+2n_2-4}{(n_1+n_2)^2}\sigma^4+\frac {4}{(n_1+n_2)^2} \sigma^4 = \frac {2}{n_1+n_2}\sigma^4$$

In order for sample-pooling to be superior in MSE terms we want that

$$MSE(\hat \sigma^2_p) < MSE(\hat \sigma^2_i), i=1,2$$

$$\Rightarrow \frac {2}{n_1+n_2}\sigma^4 < \frac{2n_i-1}{n_i^2} \sigma^4 \Rightarrow 2n_i^2 < 2n_in_1 - n_1 + 2n_in_2 - n_2$$

This reduces to the same condition for either $i=1$ or $i=2$, namely $$0 < - n_1 + 2n_1n_2 - n_2 \Rightarrow \frac {n_1+n_2}{n_1n_2} < 2 \Rightarrow \frac 1{n_2} + \frac {1}{n_1} < 2$$

which holds, since both sample sizes are strictly higher than unity.
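A quick brute-force check of the comparison over a grid of sample sizes (a sketch; $\sigma^4$ cancels from both sides, so it is set to $1$):

```python
# Check MSE_pooled < MSE_i for i = 1, 2 over a grid of sample sizes (both > 1)
for n1 in range(2, 101):
    for n2 in range(2, 101):
        mse_pooled = 2 / (n1 + n2)
        mse_1 = (2 * n1 - 1) / n1**2
        mse_2 = (2 * n2 - 1) / n2**2
        assert mse_pooled < mse_1 and mse_pooled < mse_2
print("pooling has the smaller MSE for every pair checked")
```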

Therefore we conclude that "unite & conquer" is the MSE-efficient approach here.

But we will lose something: if $n_1 \neq n_2$ the pooled-sample variance estimator does not admit a Gamma finite-sample distributional result, because it is a linear combination of two Gamma random variables with different scale parameters (different $\theta_i$'s). This does not result in a Gamma distribution, but in a rather complicated infinite-sum expression (see this paper). This means that, for conducting tests related to the pooled-sample variance estimator, we will have to resort to the asymptotic normality result.

Alternatively, if the difference between $n_1$ and $n_2$ is not large, and both samples have respectable sizes, we may even consider dropping observations from the larger sample in order to make $n_1 =n_2$ and preserve the Gamma distribution result.
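For example, with $n_1 = n_2 = n$ the two Gamma components share the same scale $\theta = 2\sigma^2/n$, so summing the shapes and halving gives

$$\hat \sigma^2_p = \tfrac 12\left(\hat \sigma^2_1 + \hat \sigma^2_2\right) \sim \operatorname{Gamma}\!\left(n-1,\, \frac{\sigma^2}{n}\right)$$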
