[Math] Maximum likelihood and Fisher information of uniform and binomial

probability, probability distributions, statistics

The MLE for a uniform distribution lies at a corner and not where the first-order conditions equal $0$ (i.e., not where the derivative of the log-likelihood equals $0$), because the likelihood is strictly monotonic in each parameter. Hence, for $\mathrm{unif}(a,b)$ the MLE for $a$ is $\min_i z_i$ and the MLE for $b$ is $\max_i z_i$.
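A minimal numerical sketch of this (Python; the values of $a$ and $b$ here are arbitrary, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
a_true, b_true = 2.0, 7.0            # arbitrary illustrative values
z = rng.uniform(a_true, b_true, size=100)

# The likelihood (b - a)^(-n) * 1{a <= min z_i} * 1{b >= max z_i} is
# increasing in a and decreasing in b on the feasible set, so the
# maximum sits at the corners of the constraint set:
a_hat, b_hat = z.min(), z.max()
print(a_hat, b_hat)                  # a_hat >= a_true, b_hat <= b_true
```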

(1) Does it make sense to ask what the Fisher information matrix is for the MLE of a uniform distribution? If yes, what is it?

(2) Is it possible to calculate the MLE for $N$ when $z_i \sim \mathrm{binom}(N, p)$ and both $p$ and $N$ are unknown? I suspect again that the MLE for $N$ is at a corner.

Edit 1: I initially thought that the corner solution for (2) would be $\max_i z_i$. I now realize this is not correct.
Edit 2: Related to "How to find a confidence interval for a Maximum Likelihood Estimate" and "How many books are in a library?"

Best Answer

Regarding the second question, assume that we have a sample of size $m$ of i.i.d. Binomials $Bin(n,p)$, $\mathbb z=\{z_1,\ldots,z_m\}$. Then the joint likelihood of the sample is

$$L(n,p\mid \mathbb z)=\prod_{i=1}^m {n \choose z_i}p^{z_i}(1-p)^{n-z_i}$$

Note that in our case, where $n$ is considered unknown, it is not appropriate to "merge" the $m$ binomials into one Binomial (as is usually done), because one would thereby lose sight of the restrictions on the value of $n$.
Following the usual procedure, setting the first partial derivative of the log-likelihood w.r.t. $p$ equal to $0$,

$$\frac{\partial \ln L}{\partial p} = \frac 1p\sum_{i=1}^m z_i - \frac 1{1-p}\sum_{i=1}^m (n-z_i) = 0,$$

we obtain the familiar

$$\hat p_{MLE} =\frac 1{nm}\sum_{i=1}^m z_i = \frac {\bar z} {n}$$

which is also the Method-of-Moments estimator of $p$. Inserting $\hat p_{MLE}$ in the likelihood, it becomes

$$L(n,\hat p_{MLE}\mid \mathbb z)=\prod_{i=1}^m {n \choose z_i}\left(\frac {\bar z} {n}\right)^{z_i}\left(\frac {n-\bar z} {n}\right)^{n-z_i}$$

Due to the binomial coefficient ${n \choose z_i}$ (and the logical relation between $n$ and each $z_i$: observing $z_i$ successes is impossible unless $n \ge z_i$), we have the restriction

$$\hat n \ge \max_i\{z_i\}$$

The MLE of $n$ will be $\max_i\{z_i\}$ if $L(n,\hat p_{MLE}\mid \mathbb z)$ is a decreasing function of $n$, at least for $n \ge \max_i\{z_i\}$. If this cannot be established, then you have a non-linear integer-programming problem on your hands.
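One way to see which case applies for a given sample is simply to evaluate the concentrated likelihood on an integer grid. A minimal sketch in Python (assuming NumPy and SciPy; the cap `n_max` is an arbitrary choice, needed because the profile likelihood can increase without bound, e.g. when $s^2 \ge \bar z$):

```python
import numpy as np
from scipy.stats import binom

def profile_mle_n(z, n_max=200):
    """Maximize L(n, p_hat(n) | z) with p_hat(n) = zbar / n over the
    integers n >= max(z). n_max is an arbitrary cap; if the maximizer
    sits at the cap, the profile likelihood may be unbounded in n."""
    zbar = z.mean()
    grid = np.arange(z.max(), n_max + 1)
    loglik = np.array([binom.logpmf(z, n, zbar / n).sum() for n in grid])
    return grid[loglik.argmax()]

# Toy data used further below: values 1..5, each appearing 10 times
z = np.repeat([1, 2, 3, 4, 5], 10)
n_hat = profile_mle_n(z)
print(n_hat, z.mean() / n_hat)       # profile MLE of n, implied p-hat
```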

If $\max_i\{z_i\}$ is indeed the MLE of $n$, it will have a finite-sample downward bias (which should be intuitively clear, since $\max_i\{z_i\} \le n$ always), and it will be consistent.
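The downward bias of the corner estimator is easy to see in simulation, since $\max_i z_i$ can never exceed $n$. A quick sketch (the parameter values $n=10$, $p=0.3$, $m=50$ are hypothetical, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
true_n, true_p, m = 10, 0.3, 50      # hypothetical values for illustration

# Each row is one sample of size m; the row-wise max is the corner estimator.
corner = rng.binomial(true_n, true_p, size=(10_000, m)).max(axis=1)
print(corner.mean())                 # noticeably below true_n = 10
```

Increasing $m$ pushes the mean of $\max_i z_i$ toward $n$, in line with consistency.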

One can always obtain a Method-of-Moments estimator for $n$, using the sample analogues of the moment equations $E(Z_i) =np$ and $\operatorname{Var}(Z_i) = np(1-p)$:
We have $$\hat p_{MoM} = \frac {\bar z} {\hat n_{MoM}}$$ and, with $s^2$ the bias-corrected sample variance,

$$s^2 = \bar z\left(1-\frac{\bar z}{n}\right) \Rightarrow \hat n_{MoM}= \frac {\bar z^2}{\bar z - s^2}$$

(actually some floor-, ceiling- or nearest-integer function of the RHS, since $n$ should be an integer). As a toy example, assume you have a sample of size $m=50$, with distinct values $\{1,2,3,4,5\}$, each appearing $10$ times in the sample. Then $\bar z= 3$, $s^2 = 100/49 \approx 2.04$ and

$$\hat n_{MoM} = 9,\;\; \hat p_{MoM} = 1/3$$

Compare this with $\max_iz_i = 5$.
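For what it's worth, a short numerical check of this toy example (Python, assuming NumPy):

```python
import numpy as np

# m = 50 observations: values 1..5, each appearing 10 times
z = np.repeat([1, 2, 3, 4, 5], 10)
zbar = z.mean()                      # 3.0
s2 = z.var(ddof=1)                   # 100/49 ~ 2.04, bias-corrected

n_mom = zbar**2 / (zbar - s2)        # ~ 9.38 before rounding
print(round(n_mom), zbar / round(n_mom))   # 9, 0.333...
```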