[Math] Maximum likelihood estimator for general multinomial

estimation-theory, probability, probability-distributions, statistical-inference, statistics

Let $(X_1,\ldots,X_r)\sim\text{multinomial}(n,(p_1,\ldots,p_r))$, where $p_r=1-p_1-\cdots-p_{r-1}$.

The random likelihood is $Ap_1^{X_1}\cdots p_r^{X_r}$, where $A=\binom{n}{X_1,\ldots,X_r}$ is the (positive) multinomial coefficient.

The random log-likelihood is $C+X_1\ln(p_1)+\cdots+X_r\ln(p_r)$, and the gradient w.r.t. $(p_1,\ldots,p_{r-1})$ is

$$\left(\frac{X_1}{p_1}-\frac{X_r}{p_r},\ldots,\frac{X_{r-1}}{p_{r-1}}-\frac{X_r}{p_r}\right)$$
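
As a sanity check on this gradient formula, one can compare it against central finite differences of the log-likelihood. This is only a sketch: the counts `X` and the interior point `q` below are hypothetical, not from the original post.

```python
import numpy as np

# Hypothetical counts and an arbitrary interior point (illustration only).
X = np.array([3.0, 5.0, 2.0])
q = np.array([0.2, 0.5])          # free coordinates (p_1, p_2); p_3 = 1 - p_1 - p_2

def log_lik(q):
    """Log-likelihood up to the constant C, as a function of (p_1, ..., p_{r-1})."""
    p = np.append(q, 1.0 - q.sum())
    return np.sum(X * np.log(p))

# Analytic gradient: X_k/p_k - X_r/p_r
p = np.append(q, 1.0 - q.sum())
grad_analytic = X[:-1] / p[:-1] - X[-1] / p[-1]

# Central finite differences of the log-likelihood
h = 1e-6
grad_numeric = np.array([
    (log_lik(q + h * e) - log_lik(q - h * e)) / (2 * h)
    for e in np.eye(len(q))
])

print(np.allclose(grad_analytic, grad_numeric, atol=1e-4))   # True
```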

So the gradient vanishes precisely when all the ratios $\frac{X_k}{p_k}$ agree, i.e. when $\exists t\in \mathbb{R}$ such that $\frac{X_k}{p_k}=t$ for all $k=1,\ldots,r$. Summing over $k$ then gives:

$$1=\sum_{k=1}^r p_k=\frac{1}{t}\sum_{k=1}^r X_k=\frac{n}{t}, \text{and so }t=n.$$

Substituting $t=n$ back into $\frac{X_k}{p_k}=t$ gives $p_k=\frac{X_k}{n}$, so we conclude that the maximum likelihood estimator for $(p_1,\ldots,p_{r-1})$ is $\frac1n(X_1,\ldots,X_{r-1})$.
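
Numerically, with hypothetical counts (not from the original post), every ratio $X_k/\hat p_k$ indeed equals $t=n$ at $\hat p=\frac1n X$, so the gradient vanishes there:

```python
import numpy as np

# Hypothetical counts (illustration only, not from the original post).
X = np.array([3.0, 5.0, 2.0])
n = X.sum()                       # n = 10
p_hat = X / n                     # candidate MLE, p_hat_k = X_k / n

# Every ratio X_k / p_hat_k equals the common value t = n ...
t = X / p_hat
print(t)                          # [10. 10. 10.]

# ... so the gradient w.r.t. (p_1, ..., p_{r-1}) vanishes at p_hat.
grad = X[:-1] / p_hat[:-1] - X[-1] / p_hat[-1]
print(grad)                       # [0. 0.]
```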

How do we make this conclusion rigorous? How do we know that the stationary point is a maximum? Is there an easy way to show that the Hessian is negative definite? Even if we do show this, how do we check that it is not just a local maximum?

Best Answer

Since $R:=\{(p_1,\ldots,p_r): p_k\ge 0\text{ for all }k,\ p_1+\cdots+p_r=1\}$ is compact, the likelihood function $L:=Ap_1^{X_1}\cdots p_r^{X_r}$ attains a global maximum on $R$. $L$ is non-negative and, provided every $X_k\ge 1$, is 0 everywhere on the boundary of $R$: on the boundary some $p_k=0$, so the factor $p_k^{X_k}$ vanishes. Let $L_0$ be the likelihood at the point $\frac1n(X_1,\ldots,X_{r-1})$.

If $L$ were 0 everywhere in $R$, then $L_0=0$ would trivially be a global maximum (though this case cannot actually occur here: $A>0$, so $L>0$ at any interior point of $R$).

If $L$ is non-zero somewhere in $R$, then the global maximum does not occur on the boundary, where $L=0$. It therefore lies in the interior of $R$, where it is in particular a local maximum of the log-likelihood function, which is differentiable in the interior of $R$. Hence the global maximum must occur at the unique interior stationary point $\frac1n(X_1,\ldots,X_{r-1})$, with likelihood $L_0$. (If some $X_k=0$, then $L$ need not vanish on the face $p_k=0$; in that case one restricts to that face, where $L$ has the same form with fewer variables, and repeats the argument — the estimator $\frac1n(X_1,\ldots,X_{r-1})$ is unchanged.)
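
Regarding the Hessian question: differentiating the gradient once more gives $\frac{\partial^2\ell}{\partial p_j\partial p_k}=-\frac{X_j}{p_j^2}\mathbf{1}\{j=k\}-\frac{X_r}{p_r^2}$, i.e. $H=-\operatorname{diag}(X_k/p_k^2)-(X_r/p_r^2)\mathbf{1}\mathbf{1}^\top$, which is negative definite at any interior point when every $X_k>0$ (a negative definite diagonal matrix plus a negative semi-definite rank-one term). A numerical sketch with hypothetical counts (not from the original post), also spot-checking globality by sampling interior points of the simplex:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical counts (illustration only, not from the original post).
X = np.array([3.0, 5.0, 2.0])
n = X.sum()
p_hat = X / n
r = len(X)

# Hessian of the log-likelihood w.r.t. (p_1, ..., p_{r-1}) at p_hat:
#   H[j, k] = -X_j/p_j^2 * (j == k) - X_r/p_r^2
H = (-np.diag(X[:-1] / p_hat[:-1] ** 2)
     - (X[-1] / p_hat[-1] ** 2) * np.ones((r - 1, r - 1)))
eigvals = np.linalg.eigvalsh(H)
print(eigvals.max() < 0)          # True: H is negative definite

# Global spot-check: no sampled interior point of the simplex beats p_hat.
def log_lik(p):
    return np.sum(X * np.log(p))  # log-likelihood up to the constant C

samples = rng.dirichlet(np.ones(r), size=50_000)
best = max(log_lik(p) for p in samples)
print(best <= log_lik(p_hat))     # True
```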