Estimation – How to Prove MLE of p in Binomial Distributions with Invariance

binomial-distribution, estimation, invariance, maximum-likelihood, random-variable

Suppose $X_1 \sim \text{binom}(p_1,n_1)$ and $X_2 \sim \text{binom}(p_2,n_2)$, where $X_1$ and $X_2$ are independent, and let $p = p_1 - p_2$. How can I prove that $\hat{p} = \hat{p}_1 - \hat{p}_2$? (Here, $\hat{\theta}$ denotes the MLE of $\theta$.) Initially, I tried to calculate the distribution of $X_1 + X_2$, but this turned out to be quite messy. Letting $Y = X_1 + X_2$, I found that for $y = 0,1,\ldots,n_1 + n_2$,
\begin{align*}
P(Y = y) &= P(X_1 + X_2 = y) \\[4pt]
&= \sum_{x=0}^{n_1} P(X_1 + X_2 = y|X_1 = x)P(X_1 = x) \\[4pt]
&= \sum_{x=0}^{n_1} P(x + X_2 = y) P(X_1 = x) \\[4pt]
&= \sum_{x=0}^{n_1} P(X_2 = y - x) P(X_1 = x) \\[4pt]
&= \sum_{x=0}^{n_1} {n_2 \choose y - x} p_2^{y-x} (1-p_2)^{n_2 - (y-x)} {n_1 \choose x} p_1^{x} (1-p_1)^{n_1 - x}
\end{align*}

At this point I couldn't recognize $Y$ as having a familiar distribution. It's also not clear to me whether the likelihood function for $Y$ depends on $p_1$ and $p_2$ only through the difference $p = p_1 - p_2$, so I don't see how to obtain the MLE for $p$ this way.
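
As a sanity check on the algebra, here is a quick numerical sketch (made-up values for $n_1, n_2, p_1, p_2$; it assumes NumPy and SciPy are available). It confirms that the convolution pmf sums to 1, and that when $p_1 = p_2$ the sum is just $\text{binom}(p_1, n_1 + n_2)$; for $p_1 \neq p_2$ I still don't see a standard named distribution.

```python
# Numerical check of the convolution formula (illustrative values only;
# assumes NumPy and SciPy are installed).
import numpy as np
from scipy.stats import binom

n1, p1 = 7, 0.30
n2, p2 = 5, 0.30   # p2 = p1 here so the closed-form comparison below applies

y_vals = np.arange(n1 + n2 + 1)
# P(Y = y) = sum_x P(X2 = y - x) P(X1 = x); binom.pmf returns 0 outside the support
pmf_Y = np.array([
    sum(binom.pmf(y - x, n2, p2) * binom.pmf(x, n1, p1) for x in range(n1 + 1))
    for y in y_vals
])

print(np.isclose(pmf_Y.sum(), 1.0))                        # True
print(np.allclose(pmf_Y, binom.pmf(y_vals, n1 + n2, p1)))  # True only because p1 == p2
```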

I came across a post from 2015 on the same question, but it didn't have an answer. Also, one of the commenters on that post linked to this post on the invariance property of the MLE:

If $\hat{\theta}$ is the MLE of $\theta$, then for any function $\tau(\theta)$, the MLE of $\tau(\theta)$ is $\tau(\hat{\theta})$.

(I'm quoting Theorem 7.2.10 of Casella and Berger's Statistical Inference.) However, I don't see how this theorem applies to my question, since I'm interested in the MLE of a function of two parameters associated with two independent random variables. If there were an invariance theorem saying that, given independent random variables $X_1 \sim f(x_1|\theta_1)$ and $X_2 \sim g(x_2|\theta_2)$, we have $\widehat{\tau(\theta_1,\theta_2)} = \tau(\hat{\theta}_1,\hat{\theta}_2)$, then I would agree that $\hat{p} = \hat{p}_1 - \hat{p}_2$, but I don't believe this follows from the theorem above. Does such an invariance theorem exist? If not, how should I go about finding the MLE of $p$?

Best Answer

I'll show you how the invariance property of the MLE applies to this case. The invariance property is not limited to scalar parameters: the same argument via the induced likelihood used in the proof goes through when $\theta$ is a vector and $\tau$ maps it to a lower-dimensional quantity, so it does cover your setting. Consider the joint distribution of the two random variables, which depends on the parameter vector $\mathbf{p} = (p_1,p_2)$. Suppose you find the MLE of this parameter vector via the joint distribution, and denote this MLE by:

$$\hat{\mathbf{p}}_\text{MLE} = (\hat{p}_1, \hat{p}_2).$$
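
To make that step explicit: because $X_1$ and $X_2$ are independent, the joint likelihood factorizes into a term involving only $p_1$ and a term involving only $p_2$, so you can maximize each factor separately, and the usual single-sample binomial calculation gives the sample proportions (here $x_1$ and $x_2$ are the observed counts):

$$L(p_1, p_2 \mid x_1, x_2) = \binom{n_1}{x_1} p_1^{x_1}(1-p_1)^{n_1-x_1} \cdot \binom{n_2}{x_2} p_2^{x_2}(1-p_2)^{n_2-x_2}, \qquad \hat{p}_1 = \frac{x_1}{n_1}, \quad \hat{p}_2 = \frac{x_2}{n_2}.$$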

Now define the function $\tau$ by $\tau(a,b) = a-b$. Applying this function to both the underlying parameter vector and its MLE, we get:

$$\begin{align} \tau(\mathbf{p}) = \tau(p_1, p_2) &= p_1 - p_2, \\[6pt] \tau(\hat{\mathbf{p}}_\text{MLE}) = \tau(\hat{p}_1, \hat{p}_2) &= \hat{p}_1 - \hat{p}_2. \\[6pt] \end{align}$$

Now, the invariance property of the MLE says that the MLE of $\tau(\mathbf{p})$ is $\tau(\hat{\mathbf{p}}_\text{MLE})$, which is exactly the statement that the MLE of $p_1 - p_2$ is $\hat{p}_1 - \hat{p}_2$. (Note also that the joint distribution does not depend on $\mathbf{p}$ only through the difference $p_1 - p_2$, so you cannot reduce the problem to a one-parameter likelihood for $p$ alone; the invariance route above is the clean way to get the MLE of $p$.)
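
If you want to see this concretely, here is a rough numerical sketch (made-up counts; assumes NumPy and SciPy) using the induced/profile likelihood that underlies the invariance property: $L^*(p) = \sup_{p_1 - p_2 = p} L(p_1, p_2)$ is maximized, up to grid resolution, at exactly $\hat{p}_1 - \hat{p}_2$.

```python
# Rough numerical check of the invariance claim (illustrative counts; assumes
# NumPy and SciPy). The induced/profile likelihood of p = p1 - p2,
#     L*(p) = sup_{(p1, p2): p1 - p2 = p} L(p1, p2),
# is maximized (up to grid resolution) at p1_hat - p2_hat = x1/n1 - x2/n2.
import numpy as np
from scipy.stats import binom

n1, x1 = 40, 13   # hypothetical observed data
n2, x2 = 25, 11

def loglik(p1, p2):
    return binom.logpmf(x1, n1, p1) + binom.logpmf(x2, n2, p2)

p1_grid = np.linspace(1e-6, 1 - 1e-6, 2001)

def profile_loglik(p):
    # maximize over p1, with p2 = p1 - p constrained to lie in (0, 1)
    p2 = p1_grid - p
    ok = (p2 > 0) & (p2 < 1)
    return loglik(p1_grid[ok], p2[ok]).max() if ok.any() else -np.inf

p_grid = np.linspace(-0.999, 0.999, 1999)
p_hat = p_grid[np.argmax([profile_loglik(p) for p in p_grid])]
print(p_hat, x1 / n1 - x2 / n2)   # both approximately -0.115
```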
