[Math] Finding the MLE of a multinomial distribution using observed data (2 sets of observations)

maximum-likelihood, optimization, probability, statistical-inference, statistics

A trial has three possible outcomes: $1$, $2$, and $3$, and the probabilities of getting
outcomes $1$, $2$, and $3$ are $p_1$, $p_2$, and $1 - p_1 - p_2$, respectively.
Suppose that we observed two sets of independent outcomes from the trial. For the first
set of observations, we have:

$x_1$ = number of trials with outcome 1 in the first set of observations = $5$

$x_2$ = number of trials with outcome 2 in the first set of observations = $3$

$x_3$ = number of trials with outcome 3 in the first set of observations = $2$.

For the second set of observations, which is independent from the first set, we have:

$x_4$ = number of trials that do not have outcome 3 in the second set of observations = $9$,

$x_5$ = number of trials with outcome 3 in the second set of observations = $3$.

What is the likelihood function of $\theta = (p_1, p_2)$ based on the observed data? Obtain the
analytical solution of the MLE and compute its numerical value based on the observed data.

I'm confused by the fact that there are two sets of observations; how should I approach this?

Best Answer

Likelihoods are multiplicative whenever they arise from independent samples. Therefore, if you know how to construct a likelihood for each individual set of observations, you simply multiply them together to obtain the likelihood of the combined data. The subtlety here is that the second set requires a little more care: it only records whether or not each trial resulted in outcome $3$, so its likelihood is binomial with "success" probability $p_1 + p_2$.
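
For concreteness, here is a minimal numerical sketch of that multiplicative structure, assuming NumPy/SciPy are available; on the log scale the product of the two per-set likelihoods becomes a sum. The counts are taken from the question, but the helper name `log_likelihood` and the use of `scipy.stats` are my own choices here, not part of the original problem.

```python
from scipy.stats import multinomial, binom

# Observed counts from the question.
x1, x2, x3 = 5, 3, 2   # first set: counts of outcomes 1, 2, 3
x4, x5 = 9, 3          # second set: counts of "not outcome 3" and "outcome 3"

def log_likelihood(p1, p2):
    """Joint log-likelihood: sum of the two independent sets' contributions."""
    p3 = 1.0 - p1 - p2
    # First set: full multinomial over the three outcomes.
    ll_set1 = multinomial.logpmf([x1, x2, x3], n=x1 + x2 + x3, p=[p1, p2, p3])
    # Second set: only "outcome 3 or not" is recorded, so it is binomial
    # with "success" (not outcome 3) probability p1 + p2.
    ll_set2 = binom.logpmf(x4, n=x4 + x5, p=p1 + p2)
    return ll_set1 + ll_set2
```

Because the two sets are independent, their likelihoods multiply, which is exactly the additivity on the log scale used below.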


The likelihood arising from both sets of observations is $$\mathcal L(p_1, p_2 \mid \boldsymbol x) \propto p_1^{x_1} p_2^{x_2} (p_1 + p_2)^{x_4} (1-p_1-p_2)^{x_3+x_5},$$ hence the log-likelihood is $$\ell(p_1, p_2 \mid \boldsymbol x) = x_1 \log p_1 + x_2 \log p_2 + x_4 \log (p_1 + p_2) + (x_3+x_5) \log (1 - p_1 - p_2).$$

The partial derivative with respect to $p_1$ is $$\frac{\partial\ell}{\partial p_1} = \frac{x_1}{p_1} + \frac{x_4}{p_1+p_2} - \frac{x_3+x_5}{1-p_1-p_2},$$ and similarly for $p_2$ we have $$\frac{\partial\ell}{\partial p_2} = \frac{x_2}{p_2} + \frac{x_4}{p_1+p_2} - \frac{x_3+x_5}{1-p_1-p_2}.$$

The log-likelihood is maximized at some $(\hat p_1, \hat p_2)$ for which both partial derivatives equal zero, so subtracting the second equation from the first gives $$\frac{x_1}{p_1} - \frac{x_2}{p_2} = 0,$$ or $x_2 p_1 = x_1 p_2$. Substituting back into the first equation gives $$\begin{align*} 0 &= x_1 \left(\frac{1}{p_1} + \frac{x_4}{(x_1+x_2)p_1} - \frac{x_3+x_5}{x_1 - (x_1+x_2)p_1} \right) \\ &= x_1 \left( \left( \frac{x_1 + x_2 + x_4}{x_1+x_2} \right)\frac{1}{p_1} - \frac{x_3+x_5}{x_1 - (x_1+x_2) p_1} \right). \end{align*}$$

Dividing both sides by $x_1$ and rearranging terms gives $$\left(\frac{x_1+x_2}{x_1+x_2+x_4}\right) p_1 = \frac{x_1 - (x_1+x_2)p_1}{x_3+x_5},$$ or $$(x_1 + x_2) \left( 1 + \frac{x_3 + x_5}{x_1+x_2+x_4}\right)p_1 = x_1,$$ and we finally get $$\hat p_1 = \frac{x_1 (x_1+x_2+x_4)}{(x_1+x_2)(x_1+x_2+x_3+x_4+x_5)}.$$ Similarly, $$\hat p_2 = \frac{x_2 (x_1+x_2+x_4)}{(x_1+x_2)(x_1+x_2+x_3+x_4+x_5)}.$$ (We will forgo the verification that this critical point is indeed a maximum.)
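
Plugging the observed counts ($x_1 = 5$, $x_2 = 3$, $x_3 = 2$, $x_4 = 9$, $x_5 = 3$) into the closed form gives $\hat p_1 = \frac{5 \cdot 17}{8 \cdot 22} = \frac{85}{176} \approx 0.483$ and $\hat p_2 = \frac{3 \cdot 17}{8 \cdot 22} = \frac{51}{176} \approx 0.290$. Here is a short sketch that evaluates the closed form and, as a sanity check, maximizes the log-likelihood numerically; it reuses the counts and `log_likelihood` from the sketch above, and the Nelder-Mead check is just one reasonable choice, not part of the original answer.

```python
import numpy as np
from scipy.optimize import minimize

# Closed-form MLE from the derivation above.
S = x1 + x2 + x3 + x4 + x5                       # 22
p1_hat = x1 * (x1 + x2 + x4) / ((x1 + x2) * S)   # 85/176 ≈ 0.48295
p2_hat = x2 * (x1 + x2 + x4) / ((x1 + x2) * S)   # 51/176 ≈ 0.28977

# Numerical sanity check: maximize the log-likelihood directly.
def neg_log_likelihood(p):
    p1, p2 = p
    if p1 <= 0 or p2 <= 0 or p1 + p2 >= 1:
        return np.inf                            # outside the probability simplex
    return -log_likelihood(p1, p2)

res = minimize(neg_log_likelihood, x0=[1/3, 1/3], method="Nelder-Mead")
print(p1_hat, p2_hat)   # 0.48295...  0.28977...
print(res.x)            # agrees with the closed form up to optimizer tolerance
```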
