Maximum Likelihood Estimate for 2 Coins Combination (Bernoulli Trials)

maximum likelihood, probability

Given:

  • 2 coins: $C_1$ and $C_2$
  • $p$: probability of choosing $C_1$ to flip.
  • $p_1$: probability that $C_1$ lands heads.
  • $p_2$: probability that $C_2$ lands heads.
  • All trials are Bernoulli, which means $P(\text{choosing } C_2) = 1-p$, $P(\text{tails on } C_1) = 1-p_1$, and $P(\text{tails on } C_2) = 1-p_2$.

I am trying to derive formulas for $p$, $p_1$ and $p_2$ using maximum likelihood estimation.

My professor gave me a hint: use a hidden random variable $Z$, where $Z$ is 1 when $C_1$ is chosen and 0 when $C_2$ is chosen. He then asked me to maximize the following expression:

$$ \prod_{i=1}^N \left[p\, p_1^{x_i}(1-p_1)^{1-x_i}\right]^{z_i}\left[(1-p)\,p_2^{x_i}(1-p_2)^{1-x_i}\right]^{1-z_i}$$

Can somebody please explain where this expression comes from?

Best Answer

The idea is that you perform the experiment with $N$ observations. The $i^{\rm th}$ observation is an ordered pair $(z_i, x_i)$, where $z_i = 1$ if $C_1$ is chosen and $z_i = 0$ if $C_2$ is chosen, and $x_i = 1$ if the outcome of the flip is heads and $x_i = 0$ if it is tails. So your sample is a vector of ordered pairs; alternatively, you can think of it as two vectors of length $N$: $$\boldsymbol z = (z_1, \ldots, z_N), \quad \boldsymbol x = (x_1, \ldots, x_N).$$

Now, you have in essence a Bernoulli/Bernoulli hierarchical model: which of $p_1$, $p_2$ governs a given flip depends on a second Bernoulli outcome with parameter $p$. So your likelihood will look like a composition of Bernoulli probability mass functions.

For each $i$ such that $z_i = 1$, the coin $C_1$ is flipped, and the outcome of that flip is heads with probability $p_1$ and tails with probability $1-p_1$; therefore, each such flip contributes $$p_1^{x_i} (1-p_1)^{1-x_i}$$ to the likelihood. But $z_i = 1$ itself occurs only with probability $p$; thus the subset of the sample for which $C_1$ is tossed contributes a total of $$\prod_{i=1}^N (p \cdot p_1^{x_i}(1-p_1)^{1-x_i})^{z_i}.$$ The exponent $z_i$ makes every factor with $z_i = 0$ equal to 1, so only the $C_1$ observations remain in the product.

Similarly, $z_i = 0$ occurs with probability $1-p$, in which case $C_2$ is tossed and heads are obtained with probability $p_2$; thus this portion of the likelihood looks like $$\prod_{i=1}^N ((1-p)p_2^{x_i}(1-p_2)^{1-x_i})^{1-z_i}.$$

Put together, the total likelihood is $$\mathcal L(p, p_1, p_2 \mid \boldsymbol z, \boldsymbol x) = \prod_{i=1}^N (p \cdot p_1^{x_i} (1-p_1)^{1-x_i})^{z_i} ((1-p) p_2^{x_i}(1-p_2)^{1-x_i})^{1-z_i}.$$
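To make the maximization concrete, here is a minimal Python sketch. It assumes the $z_i$ are actually observed, as the likelihood above treats them. Taking logs, the likelihood splits into three separate Bernoulli log-likelihoods, so setting each partial derivative to zero gives the sample proportions $\hat p = \frac{1}{N}\sum z_i$, $\hat p_1 = \sum z_i x_i / \sum z_i$ and $\hat p_2 = \sum (1-z_i) x_i / \sum (1-z_i)$. The function names `log_likelihood` and `mle` and the sample data are made up for illustration and are not part of the original question.

```python
import math

def log_likelihood(p, p1, p2, z, x):
    """Log of the likelihood above; all parameters must lie strictly in (0, 1)."""
    total = 0.0
    for zi, xi in zip(z, x):
        if zi == 1:   # C_1 was chosen (probability p), then flipped
            total += math.log(p) + (math.log(p1) if xi == 1 else math.log(1 - p1))
        else:         # C_2 was chosen (probability 1 - p), then flipped
            total += math.log(1 - p) + (math.log(p2) if xi == 1 else math.log(1 - p2))
    return total

def mle(z, x):
    """Closed-form maximizers; assumes each coin was chosen at least once."""
    n = len(z)
    n1 = sum(z)          # number of times C_1 was chosen
    n2 = n - n1          # number of times C_2 was chosen
    heads1 = sum(xi for zi, xi in zip(z, x) if zi == 1)
    heads2 = sum(xi for zi, xi in zip(z, x) if zi == 0)
    return n1 / n, heads1 / n1, heads2 / n2

# Hypothetical sample of N = 8 observations (z_i, x_i).
z = [1, 1, 1, 0, 0, 1, 0, 1]
x = [1, 0, 1, 1, 0, 1, 0, 0]

p_hat, p1_hat, p2_hat = mle(z, x)
print(p_hat, p1_hat, p2_hat)
print(log_likelihood(p_hat, p1_hat, p2_hat, z, x))
```

If $Z$ is genuinely hidden, so that only the $x_i$ are recorded, these closed forms no longer apply directly and one would need something like an EM iteration; the sketch does not cover that case.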
