[Math] Maximum likelihood estimators, hypergeometric and binomial

estimation-theory, optimization, probability distributions, statistics

I'm trying to solve a two-part problem. The setup is as follows: consider a bag with $\theta$ red marbles and $7-\theta$ blue marbles, with $\theta$ unknown. Let $x$ denote the number of red marbles found in a sample of 3. If we sample without replacement, what is a maximum likelihood estimator for $\theta$ (the number of red marbles), based on our sample?
If we sample with replacement, what is the maximum likelihood estimator for $\theta$? I'd also like to check whether these estimators are unbiased, and which has the smaller variance (I'd expect sampling without replacement to be superior).

1.) Sampling without replacement yields a hypergeometric distribution, with the likelihood function $\large L(\theta)=\frac{\binom{\theta}{x}\binom{7-\theta}{3-x}}{\binom{7}{3}}$. I think what I want to do is look at the ratio $\large \hat{L}(\theta)=\frac{L(\theta)}{L(\theta+1)}$: the likelihood is increasing in $\theta$ while $\hat{L}(\theta)<1$ and decreasing once $\hat{L}(\theta)>1$, so we can take the MLE to be the turning point. A bit of algebra shows that $\hat{L}(\theta)\ge 1$ exactly when $\theta\ge\frac{8x-3}{3}$. When this fraction is an integer, we can see that $\hat{L}(\theta)=1$ there, so $\frac{8x-3}{3}+1$ is also an MLE (it is not unique). If it is not an integer, we round up to the next integer and see that the MLE is $[\frac{8x}{3}]$, where the brackets represent the floor function (since $[\frac{8x-3}{3}]<[\frac{8x}{3}]$, we only have one MLE in this case).
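One way to sanity-check the ratio argument above is a brute-force search over the eight possible values of $\theta$. The sketch below is only an illustration: the function names are made up for this example, and it compares the integer numerator of the likelihood, since the denominator $\binom{7}{3}$ is the same for every $\theta$. It also prints $[\frac{8x}{3}]$ capped at 7, because for $x=3$ the floor formula gives 8, which is outside the parameter space.

```python
from math import comb

N, n = 7, 3  # bag size and sample size from the question

def hypergeom_numerator(theta, x):
    """Numerator of L(theta); the denominator comb(N, n) does not depend on theta."""
    return comb(theta, x) * comb(N - theta, n - x)

def hypergeom_mle(x):
    """All integer theta in 0..N that maximise the likelihood for observed x."""
    values = [hypergeom_numerator(theta, x) for theta in range(N + 1)]
    best = max(values)
    return [theta for theta, value in enumerate(values) if value == best]

for x in range(n + 1):
    # Brute-force maximisers next to floor(8x/3), capped at the bag size.
    print(x, hypergeom_mle(x), min((N + 1) * x // n, N))
```

For this particular bag the search returns a single maximiser for each $x$ (0, 2, 5 and 7 for $x=0,1,2,3$). The tie described above would need $\frac{8x-3}{3}$ to be an integer, which here happens only for $x=0$ and $x=3$, where one of the two tied values ($-1$ or $8$) is not a possible number of red marbles.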

2.) The binomial case is more confusing to me, though perhaps I'm just thinking about it incorrectly. I can easily find the maximum likelihood estimate of the success probability $p$ (which is, in this case, $\frac{\theta}{7}$), and show that this MLE is simply $\frac{x}{3}$ (this follows from taking the derivative of the log likelihood and setting it to zero). Then solving the equation $\frac{\theta}{7}=\frac{x}{3}$ yields $\theta=\frac{7x}{3}$ (this result is similar to the hypergeometric result, which is reassuring). It is not, however, always a whole number, and I can't tell whether I should take the floor or the ceiling in this case (or perhaps both provide MLEs).
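Since $\theta$ can only take the values $0,\dots,7$, the floor-or-ceiling question can be settled by evaluating the binomial likelihood at each integer directly. The following is a rough sketch rather than a general method; the helper names are hypothetical and exact fractions are used so that ties, if any, would be detected without rounding error.

```python
from fractions import Fraction
from math import comb

N, n = 7, 3  # bag size and sample size from the question

def binom_likelihood(theta, x):
    """P(x red in n draws with replacement | theta red marbles in the bag)."""
    p = Fraction(theta, N)
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binom_mle(x):
    """All integer theta in 0..N that maximise the binomial likelihood."""
    values = [binom_likelihood(theta, x) for theta in range(N + 1)]
    best = max(values)
    return [theta for theta, value in enumerate(values) if value == best]

for x in range(n + 1):
    # Integer maximisers next to the unrestricted solution 7x/3.
    print(x, binom_mle(x), Fraction(N * x, n))
```

In this example the maximiser is unique for each $x$: it is the floor of $\frac{7x}{3}$ when $x=1$ (giving 2) and the ceiling when $x=2$ (giving 5), so neither rounding rule works on its own. The two candidates have to be compared.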

Finally, assuming that my logic is sound up to this point, I'm not sure how to check whether either of these estimators is unbiased, or how to compare the variances (I think this should be relatively easy; I'm just drawing a blank!)

Thanks so much for reading – any help is greatly appreciated!

Best Answer

Let $X_i$ be the number of red marbles on the $i$th draw, so $X_i$ is either $0$ or $1$. Let $X=X_1+X_2+X_3$, so that by linearity $E(X)$ is the sum of three expected values. By symmetry, all three expected values are equal, whether or not the marbles are replaced. So $E(7X/3) = \frac73\cdot 3\cdot E(X_1) = 7\cdot\Pr(\text{red on first draw}) = 7\cdot\frac\theta7 = \theta$. Therefore $7X/3$ is an unbiased estimator of $\theta$ regardless of whether you're drawing with or without replacement.
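The unbiasedness claim can be checked exactly for every possible $\theta$ by summing over the distribution of $X$ under each sampling scheme. The sketch below is illustrative only (the function names are invented for it), and it concerns the estimator $7X/3$ from this answer, not the integer-valued MLE. It also computes the variance, since the question asks how the two schemes compare.

```python
from fractions import Fraction
from math import comb

N, n = 7, 3  # bag size and sample size from the question

def pmf_without_replacement(theta):
    """Hypergeometric pmf of X for x = 0..n."""
    return [Fraction(comb(theta, x) * comb(N - theta, n - x), comb(N, n))
            for x in range(n + 1)]

def pmf_with_replacement(theta):
    """Binomial pmf of X for x = 0..n, with success probability theta/N."""
    p = Fraction(theta, N)
    return [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

def mean_and_variance(pmf):
    """Exact mean and variance of the estimator 7X/3 under the given pmf of X."""
    values = [Fraction(N * x, n) for x in range(n + 1)]
    mean = sum(v * q for v, q in zip(values, pmf))
    variance = sum((v - mean) ** 2 * q for v, q in zip(values, pmf))
    return mean, variance

for theta in range(N + 1):
    print(theta,
          mean_and_variance(pmf_without_replacement(theta)),
          mean_and_variance(pmf_with_replacement(theta)))
```

For each $\theta$ the mean comes out as exactly $\theta$ under both schemes. The variance is $\frac{2\theta(7-\theta)}{9}$ without replacement and $\frac{\theta(7-\theta)}{3}$ with replacement, so sampling without replacement is never worse here and is strictly better whenever $0<\theta<7$.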

(As to the MLE in the without-replacement case, I'll be back later...)